ABSTRACT

The floating point hardware register set is not given to any user
level thread unless it is required to perform floating point
operations. Thus, for any non-floating point thread, its context
does not include the floating point hardware state. This
effectively reduces the amount of information to be handled
when threads are swapped in the processor. During the
course of a thread's execution, at the first instance of an
attempt by the thread to execute a floating point instruction,
the "float-unavailable" exception occurs. This, in turn,
invokes the microkernel's floating point exception handler.
The function of this exception handler is to make floating
point available to the thread that requires it. The exception
handler dynamically allocates space for saving the thread's
floating point registers, initializes the registers, and turns on
the "float-available" bit in its machine state register. Once a
thread obtains floating point context, it continues to have it
for the remainder of its life.

2 Claims, 11 Drawing Sheets

FIG. 1A

FIG. 1B

FOUR LEVELS OF EXCEPTION PROCESSING IN THE EXCEPTION HANDLER 190

I) THE FIRST LEVEL INTERRUPT HANDLER (FLIH) (FIG. 2)

II) THE PRE-SECOND LEVEL INTERRUPT HANDLER (CALL_SLIH()) (FIG. 3)

III) THE SECOND LEVEL INTERRUPT HANDLER (SLIH) (FIG. 4)



IV) THE EXIT HANDLER (FIG. 5)

FIG. 2

FIRST LEVEL INTERRUPT HANDLER

I) USING THE SPECIAL PURPOSE REGISTERS SPRG0-SPRG1, GPR2 AND
GPR3 ARE SAVED.

II) GPR3 IS SET TO THE PHYSICAL ADDRESS OF THE CPU_VAR STRUCTURE.

III) GPR2 IS SET TO VM_KERNEL_PHYS_SEG UPPER TO BE USED BY
FLIH PANIC.

IV) GPR4 AND GPR5 ARE SAVED INTO THE CPU_VAR'S SCRATCH FAST
SAVE AREA.

V) SAVE SRR0 AND SRR1 IN GPR4 AND GPR5.

VI) PREPARE FOR AND JUMP TO HIGH MEMORY. NOTE THAT CALL_SLIH IS
IN VIRTUAL HIGH MEMORY.

VII) GPR2 IS SET TO THE TOC OFFSET OF CALL_SLIH FROM THE CPU_VARS
STRUCTURE.

VIII) SRR0 (I.E., IAR) IS SET TO THE VALUE OF GPR2.

IX) GPR3 IS SET TO THE ADDRESS OF THE SAVE STATE AREA.

X) TRANSLATIONS ENABLED, I.E., SRR1 IS SET TO MSR_IR AND MSR_DR.

XI) GPR2 IS SET TO THE ACTUAL SLIH ENTRY TOC OFFSET.



XII) PERFORM AN RFI.

FIG. 3

PRE-SECOND LEVEL INTERRUPT HANDLER (CALL_SLIH()).

I) SAVE THE REST OF THE MACHINE STATE TO THE STATE SAVE AREA IN
THE CPU_VARS STRUCTURE.
THE TOC REGISTER HAS BEEN INITIALIZED. GPR2 HAS KERNEL TOC.
ALL THE STATE HAS BEEN SAVED INTO THE PPC SAVE STATE
STRUCTURE POINTED TO BY GPR3.
GPR4 CONTAINS THE VALUE OF SRR1 (OLD MSR)
GPR5 CONTAINS THE VALUE OF DSISR
GPR6 CONTAINS THE VALUE OF DAR
GPR7 CPU_VARS POINTER
GPR10 CONTAINS THE SLIH TOC ENTRY POINT.

II) EXAMINE MSR[PR] BIT TO DETERMINE IF THIS EXCEPTION CAUSED A
USER-TO-KERNEL OR KERNEL-TO-KERNEL TRANSITION, TO KNOW WHETHER
THE KERNEL STACK IS TO BE RESTORED IN R1 AND TO CHOOSE THE
APPROPRIATE EXIT ROUTINE TO RETURN. IF THE PR BIT WAS SET TO 1, THE
ADDRESS OF LOAD_AND_GO_USER IS LOADED INTO GPR14.
OTHERWISE LOAD R1 TO POINT TO THE KERNEL STACK FROM THE CPU_VARS
STRUCTURE POINTED TO BY GPR7.

III) ALLOCATE NEW PPC_SAVED_STATE AREA ON THE KERNEL STACK AND
BACK CHAIN IT TO THE PREVIOUS PPC_SAVED_STATE ON THE LIST.

IV) UPDATE THE NEXT POINTER CV_NEXT_RSS IN THE RSS CHAIN TO
POINT TO THE NEW PPC_SAVED_STATE IN THE CPU_VARS STRUCTURE.

V) ALLOCATE A C FRAME ON THE STACK AND NULL BACK CHAIN IT.

VI) LOAD GPR13 WITH THE PPC_SAVED_STATE CURRENTLY IN GPR3
(GPR13 IS NON-VOLATILE AND GPR3 IS NOT, ACCORDING
TO LINKAGE CONVENTIONS).

VII) MOVE GPR10 TO COUNTER REGISTER. NOW THE COUNTER REGISTER
SHOULD HAVE ADDRESS TO SLIH.

VIII) BRANCH THROUGH THE COUNT REGISTER TO THE SLIH.

FIG. 4

SECOND LEVEL INTERRUPT HANDLER (SLIH).

THE SLIH IS GENERALLY A 'C' ROUTINE THAT ACTUALLY HANDLES THE
EXCEPTION AND RETURNS. IT IS ENTIRELY LEFT TO THE DISCRETION OF THE
SLIH TO HAVE THE EXTERNAL INTERRUPTS RE-ENABLED WHILE THEY ARE
EXECUTING. THE SLIH WILL NOT RETURN TO THE CALL_SLIH() ROUTINE
BECAUSE CALL_SLIH() LOADED THE CORRECT EXIT ROUTINE IN THE LINK
REGISTER PRIOR TO CALLING THE SLIH.






U.S. Patent Feb. 25, 1997 sheet 7 of 11 5,606,696 



FIG. 5

EXIT HANDLER.

I) DISABLE EXTERNAL INTERRUPTS.

II) IF RETURNING TO USER MODE, CHECK THE GLOBAL VARIABLE
NEED_AST. IF SET, RE-ENABLE EXTERNAL INTERRUPTS AND CALL THE AST
HANDLER, AST(). WHEN DONE, RETURN TO THE TOP OF EXIT HANDLER
LOAD_AND_GO_USER().

III) BAT3 IS ENABLED, ESTABLISHING A VIRTUAL EQUALS REAL MAPPING
FOR LOW MEMORY. TO TURN OFF TRANSLATIONS, THE KERNEL MUST BE
EXECUTING IN LOW MEMORY. A MAPPING OF LOW MEMORY MUST BE
PRESENT TO DO THE JUMP.

IV) STORE CURRENT PPC_SAVED_STATE TO THE PER CPU VARIABLE
CV_NEXT_RSS. THIS CAUSES THE CURRENT PPC_SAVED_STATE AREA TO BE
RE-USED ON THE NEXT EXCEPTION.

V) RESTORE TWO REGISTERS TEMPORARILY IN TWO SPRS SO THAT
THERE IS ROOM TO STORE SRR0 AND SRR1.

VI) PUT THE PHYSICAL ADDRESS OF SCRATCH SAVE AREA/CPU_VARS
INTO R1.

VII) RESTORE THE REMAINING MACHINE STATE.

VIII) JUMP DOWN TO LOW MEMORY. HERE IT JUMPS TO PHYSICAL
ADDRESS 0X3000 WHERE THE LOW_ADDRESS_RFI() IS. THIS ROUTINE DOES
THE FOLLOWING:

DISABLE TRANSLATIONS.
RESTORE SRR0 AND SRR1 FROM THE SCRATCH REGISTERS.
RESTORE THE TWO SCRATCH REGISTERS FROM THE SPECIAL
PURPOSE REGISTERS.

RFI - RETURN FROM INTERRUPT

FIG. 8

ALIGNMENT EXCEPTION HANDLER 194

1) ENTRY AT PHYSICAL ADDRESS 0X600.

2) TEMPORARILY SAVE A WORK REGISTER INTO SPR_G0.

3) GET ADDRESS OF CPU_VARS.FH_SAVE_AREA FROM THE SPR_CPU
REGISTER.

4) CONVERT VIRTUAL ADDRESS OF FH_SAVE_AREA INTO A PHYSICAL
ADDRESS.

5) SAVE REGISTERS USED OR AFFECTED BY EXCEPTION HANDLER
(GPR25 THROUGH GPR31, LR, CR, XER, SRR0, AND SRR1).

6) MOVE COPIES OF DSISR, DAR, AND MSR INTO WORK REGISTERS.

7) ASSERT THAT PROCESSOR WAS IN PROBLEM MODE AT TIME OF
EXCEPTION.

8) CHECK ADDRESS BOUNDS OF OPERATION AGAINST KERNEL
VIRTUAL ADDRESS SPACE.

9) MOVE DSISR INTO CR FOR BIT TESTS.

10) BRANCH INTO INSTRUCTION DECODE (DSISR) TABLE BASED ON
DSISR[15-21].

11) EXECUTE APPROPRIATE SUBMODULE (SUBMODULE DESCRIPTIONS
ARE GIVEN IN THE FOLLOWING SUBMODULES SECTION).

12) RESTORE SAVED STATE AND RETURN TO USER MODE.
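
Step 10 of FIG. 8 selects a submodule from the field DSISR[15-21]. A minimal sketch of that field extraction and table dispatch follows, assuming the PowerPC convention that bit 0 is the most significant bit of the 32-bit register. The names dsisr_field and dispatch, and the table contents, are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict


def dsisr_field(dsisr: int, first_bit: int, last_bit: int) -> int:
    """Extract bits first_bit..last_bit of a 32-bit value.

    Bit 0 is the most significant bit, as in PowerPC bit numbering.
    """
    width = last_bit - first_bit + 1
    return (dsisr >> (31 - last_bit)) & ((1 << width) - 1)


def dispatch(dsisr: int,
             table: Dict[int, Callable[[], str]],
             default: Callable[[], str]) -> str:
    """Branch into a decode table indexed by DSISR[15-21]."""
    index = dsisr_field(dsisr, 15, 21)
    return table.get(index, default)()


# Hypothetical table: the real index-to-submodule mapping is not given here.
table = {1: lambda: "hypothetical submodule A"}
unhandled = lambda: "no submodule for this index"

print(dsisr_field(0x0001FC00, 15, 21))     # 127: all seven field bits set
print(dispatch(1 << 10, table, unhandled))  # hypothetical submodule A
```

The seven-bit field gives at most 128 table entries, which is why a direct-indexed table is a natural fit for this step.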
EXCEPTION HANDLING METHOD AND
APPARATUS FOR A MICROKERNEL DATA
PROCESSING SYSTEM

This application is a continuation of U.S. patent application
Ser. No. 08/303,796, filed Sep. 9, 1994, now U.S. Pat.
No. 5,481,719.



FIELD OF THE INVENTION

The invention disclosed broadly relates to data processing
systems and more particularly relates to improvements in
operating systems for data processing systems.

RELATED PATENT APPLICATIONS 

The invention disclosed herein is related to the copending
U.S. patent application Ser. No. 263,710, by Guy G. Sotomayor,
Jr., James M. Magee, and Freeman L. Rawson, III,
which is entitled "METHOD AND APPARATUS FOR
MANAGEMENT OF MAPPED AND UNMAPPED
REGIONS OF MEMORY IN A MICROKERNEL DATA
PROCESSING SYSTEM", filed Jun. 21, 1994, IBM Docket
Number BC9-94-053, assigned to the International Business
Machines Corporation, and incorporated herein by reference.

The invention disclosed herein is also related to the
copending U.S. patent application Ser. No. 263,313, by
James M. Magee, et al., which is entitled "CAPABILITY
ENGINE METHOD AND APPARATUS FOR A MICROKERNEL
DATA PROCESSING SYSTEM", filed Jun. 21,
1994, IBM Docket Number BC9-94-071, assigned to the
International Business Machines Corporation, and incorporated
herein by reference.

The invention disclosed herein is also related to the
copending U.S. patent application Ser. No. 263,633, by
James M. Magee, et al., which is entitled "TEMPORARY
DATA METHOD AND APPARATUS FOR A MICROKERNEL
DATA PROCESSING SYSTEM", filed Jun. 21, 1994,
IBM Docket Number BC9-94-076, assigned to the International
Business Machines Corporation, and incorporated
herein by reference.

The invention disclosed herein is also related to the
copending U.S. patent application Ser. No. 263,703, by
James M. Magee, et al., which is entitled "MESSAGE
CONTROL STRUCTURE REGISTRATION METHOD
AND APPARATUS FOR A MICROKERNEL DATA PROCESSING
SYSTEM", filed Jun. 21, 1994, IBM Docket
Number BC9-94-077, assigned to the International Business
Machines Corporation, and incorporated herein by reference.

The invention disclosed herein is also related to the
copending U.S. patent application Ser. No. 263,709, by
James M. Magee, et al., which is entitled "ANONYMOUS
REPLY PORT METHOD AND APPARATUS FOR A
MICROKERNEL DATA PROCESSING SYSTEM", filed
Jun. 21, 1994, IBM Docket Number BC9-94-080, assigned
to the International Business Machines Corporation, and
incorporated herein by reference.

The invention disclosed herein is also related to the
copending U.S. patent application Ser. No. 08/281,217, by
Aziza Bushra Faruqi, et al., which is entitled "SEPARATION
OF TRANSMISSION CONTROL METHOD AND APPARATUS
FOR A MICROKERNEL DATA PROCESSING
SYSTEM", filed Jul. 27, 1994, IBM Docket Number
BC9-94-081 XX, assigned to the International Business Machines
Corporation, and incorporated herein by reference.

The invention disclosed herein is also related to the
copending U.S. patent application Ser. No. 08/303,005, by
Ram K. Gupta, Ravi Srinivasan, Dennis Ackerman, and
Himanshu Desai, which is entitled "PAGE TABLE ENTRY
MANAGEMENT METHOD AND APPARATUS FOR A
MICROKERNEL DATA PROCESSING SYSTEM", filed
Sep. 9, 1994, IBM Docket Number BC9-94-073, assigned to
the International Business Machines Corporation, and incorporated
herein by reference.

BACKGROUND OF THE INVENTION 

The operating system is the most important software
running on a computer. Every general purpose computer
must have an operating system to run other programs.
Operating systems typically perform basic tasks, such as
recognizing input from the keyboard, sending output to the
display screen, keeping track of files and directories on the
disc, and controlling peripheral devices such as disc drives
and printers. For more complex systems, the operating
system has even greater responsibilities and powers. It
makes sure that different programs and users running at the
same time do not interfere with each other. The operating
system is also typically responsible for security, ensuring
that unauthorized users do not access the system.

Operating systems can be classified as multi-user operating
systems, multi-processor operating systems, multi-tasking
operating systems, and real-time operating systems. A
multi-user operating system allows two or more users to run
programs at the same time. Some operating systems permit
hundreds or even thousands of concurrent users. A
multi-processing program allows a single user to run two or more
programs at the same time. Each program being executed is
called a process. Most multi-processing systems support
more than one user. A multi-tasking system allows a single
process to run more than one task. In common terminology,
the terms multi-tasking and multi-processing are often used
interchangeably even though they have slightly different
meanings. Multi-tasking is the ability to execute more than
one task at the same time, a task being a program. In
multi-tasking, only one central processing unit is involved,
but it switches from one program to another so quickly that
it gives the appearance of executing all of the programs at
the same time. There are two basic types of multi-tasking,
preemptive and cooperative. In preemptive multi-tasking,
the operating system parcels out CPU time slices to each
program. In cooperative multi-tasking, each program can
control the CPU for as long as it needs it. If a program is not
using the CPU, however, it can allow another program to use
it temporarily. For example, the OS/2 (TM) and UNIX (TM)
operating systems use preemptive multi-tasking, whereas
the Multi-Finder (TM) operating system for Macintosh
(TM) computers uses cooperative multi-tasking.
Multi-processing refers to a computer system's ability to support more
than one process or program at the same time.
Multi-processing operating systems enable several programs to run
concurrently. Multi-processing systems are much more complicated
than single-process systems because the operating
system must allocate resources to competing processes in a
reasonable manner. A real-time operating system responds to
input instantaneously. General purpose operating systems
such as DOS and UNIX are not real-time.

Operating systems provide a software platform on top of
which application programs can run. The application programs
must be specifically written to run on top of a
particular operating system. The choice of the operating
system therefore determines to a great extent the applications
which can be run. For IBM compatible personal
computers, example operating systems are DOS, OS/2
(TM), AIX (TM), and XENIX (TM).

A user normally interacts with the operating system
through a set of commands. For example, the DOS operating
system contains commands such as COPY and RENAME
for copying files and changing the names of files, respectively.
The commands are accepted and executed by a part
of the operating system called the command processor or
command line interpreter.

There are many different operating systems for personal
computers such as CP/M (TM), DOS, OS/2 (TM), UNIX
(TM), XENIX (TM), and AIX (TM). CP/M was one of the
first operating systems for small computers. CP/M was
initially used on a wide variety of personal computers, but
it was eventually overshadowed by DOS. DOS runs on all
IBM compatible personal computers and is a single user,
single tasking operating system. OS/2, a successor to DOS,
is a relatively powerful operating system that runs on IBM
compatible personal computers that use the Intel 80286 or
later microprocessor. OS/2 is generally compatible with
DOS but contains many additional features; for example, it
is multi-tasking and supports virtual memory. UNIX and
UNIX-based AIX run on a wide variety of personal computers
and work stations. UNIX and AIX have become
standard operating systems for work stations and are powerful
multi-user, multi-processing operating systems.

In 1981, when the IBM personal computer was introduced
in the United States, the DOS operating system occupied
approximately 10 kilobytes of storage. Since that time,
personal computers have become much more complex and
require much larger operating systems. Today, for example,
the OS/2 operating system for the IBM personal computers
can occupy as much as 22 megabytes of storage. Personal
computers become ever more complex and powerful as time
goes by and it is apparent that the operating systems cannot
continually increase in size and complexity without imposing
a significant storage penalty on the storage devices
associated with those systems.

It was because of this untenable growth rate in operating
system size that the MACH project was conducted at
Carnegie Mellon University in the 1980's. The goal of that
research was to develop a new operating system that would
allow computer programmers to exploit emerging modern hardware
architectures and yet reduce the size and the
number of features in the kernel operating system. The
kernel is the part of an operating system that performs basic
functions such as allocating hardware resources. In the case
of the MACH kernel, five programming abstractions were
established as the basic building blocks for the system. They
were chosen as the minimum necessary to produce a useful
system on top of which the typical complex operations could
be built externally to the kernel. The Carnegie Mellon
MACH kernel was reduced in size in its release 3.0, and is
a fully functional operating system called the MACH microkernel.
The MACH microkernel has the following primitives:
the task, the thread, the port, the message, and the
memory object.

The traditional UNIX process is divided into two separate
components in the MACH microkernel. The first component
is the task, which contains all of the resources for a group of
cooperating entities. Examples of resources in a task are
virtual memory and communications ports. A task is a
passive collection of resources; it does not run on a processor.

The thread is the second component of the UNIX process,
and is the active execution environment. Each task may
support one or more concurrently executing computations
called threads. For example, a multi-threaded program may
use one thread to compute scientific calculations while
another thread monitors the user interface. A MACH task
may have many threads of execution, all running simultaneously.
Much of the power of the MACH programming
model comes from the fact that all threads in a task share the
task's resources. For instance, they all have the same virtual
memory (VM) address space. However, each thread in a task
has its own private execution state. This state consists of a
set of registers, such as general purpose registers, a stack
pointer, a program counter, and a frame pointer.

A port is the communications channel through which 
threads communicate with each other. A port is a resource 
and is owned by a task. A thread gains access to a port by 
virtue of belonging to a task. Cooperating programs may 
allow threads from one task to gain access to ports in another 
task. An important feature is that ports are location transparent.
This capability facilitates the distribution of services
over a network without program modification.

The message is used to enable threads in different tasks to 
communicate with each other. A message contains collec- 
tions of data which are given classes or types. This data can 
range from program specific data such as numbers or strings 
to MACH related data such as transferring capabilities of a 
port from one task to another.

A memory object is an abstraction which supports the
capability to perform traditional operating system functions 
in user level programs, a key feature of the MACH micro- 
kernel. For example, the MACH microkernel supports vir- 
tual memory paging policy in a user level program. Memory 
objects are an abstraction to support this capability. 

All of these concepts are fundamental to the MACH
microkernel programming model and are used in the kernel
itself. These concepts and other features of the Carnegie
Mellon University MACH microkernel are described in the
book by Joseph Boykin, et al., "Programming Under
MACH", Addison Wesley Publishing Company, Incorporated,
1993.

Additional discussions of the use of a microkernel to
support a UNIX personality can be found in the article by
Mike Accetta, et al., "MACH: A New Kernel Foundation for
UNIX Development", Proceedings of the Summer 1986
USENIX Conference, Atlanta, Ga. Another technical article
on the topic is by David Golub, et al., "UNIX as an
Application Program", Proceedings of the Summer 1990
USENIX Conference, Anaheim, Calif.

The above cited, copending patent application by Guy G.
Sotomayor, Jr., James M. Magee, and Freeman L. Rawson,
III, describes the microkernel system 115 shown in FIG. 1,
which is a new foundation for operating systems. The
microkernel system 115 provides a concise set of kernel
services implemented as a pure kernel and an extensive set
of services for building operating system personalities
implemented as a set of user-level servers. The microkernel
system 115 is made up of many server components that
provide the various traditional operating system functions
and that are manifested as operating system personalities.
The microkernel system 115 uses a client/server system
structure in which tasks (clients) access services by making
requests of other tasks (servers) through messages sent over
a communication channel. Since the microkernel 120 provides
very few services of its own (for example, it provides
no file service), a microkernel 120 task must communicate
with many other tasks that provide the required services.

The microkernel system 115 has, as its primary responsibility,
the provision of points of control that execute
instructions within a framework. In the microkernel 120,
points of control are the threads that execute in a virtual
environment. The virtual environment provided by the
microkernel 120 consists of a virtual processor that executes
all of the user space accessible hardware instructions, augmented
by emulated instructions (system traps) provided by
the kernel; the virtual processor accesses a set of virtualized
registers and some virtual memory that otherwise responds
as does the machine's physical memory. All other hardware
resources are accessible only through special combinations
of memory accesses and emulated instructions. Of course, it
is a physical processor that actually executes the instructions
represented by the threads.

Each physical processor that is capable of executing
threads is named by a processor control port. Although
significant in that they perform the real work, processors are
not very significant in the microkernel, other than as members
of a processor set. It is a processor set that forms the
basis for the pool of processors used to schedule a set of
threads, and that has scheduling attributes associated with it.
The operations supported for processors include assignment
to a processor set and machine control, such as start and
stop.

One advanced technology processor that can take full
advantage of the capabilities of the Microkernel System 115
is the PowerPC (TM). The PowerPC is an advanced RISC
(reduced instruction set computer) architecture, described in
the book: IBM Corporation, "The PowerPC Architecture",
Morgan-Kaufmann, San Francisco, 1994. Another description
of the PowerPC is provided in the article: Keith Diefendorff,
Rich Oehler, and Ron Hochsprung, "Evolution of
the PowerPC Architecture", IEEE Micro, April 1994, pp.
34-49. The PowerPC was designed with its architecture
divided into three parts or "books." Book 1 deals with those
features that will not change over time, such as the user
instruction set architecture, instruction definitions, opcode
assignments, register definitions, etc. Book 2 deals with
those features important to the operation of the processor in
a multiprocessing environment, such as the memory model,
consistency, atomicity and aliasing. Book 3 deals with the
operating environment architecture. These are features that
are not directly visible to the user, but instead are the
exclusive domain of the operating system. Within this part
of the architecture is the definition of the virtual-to-physical
address translation and the method of exception handling.
Because Book 3 features are supervisor privileged, it is
possible to design a PowerPC processor according to an
entirely different set of Book 3 features, and yet maintain
user application compatibility.

However, there are several problems in adapting the
microkernel 120 to the PowerPC processor. The microkernel
120, while scheduling threads of tasks running on the
system, has to save the context of the currently running
thread on the processor and restore the context of the thread
that needs to start its execution. The context of a program is
the environment (e.g., privilege and relocation) in which the
program executes. That context is controlled by the content
of certain system registers and the address translation tables.
Since the floating point hardware in PowerPC processors
includes 32 floating point registers 64 bits long, and a 32 bit
floating point status and control register, it is very inefficient
to have the threads assume the entire hardware for their
context when they are created. Such an approach leads to an
expensive context switch even when the threads do not need
floating point capability.
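
As a rough illustration of the cost described above, the size of the floating point state can be worked out from the figures given in the text (32 registers of 64 bits plus one 32 bit status and control register). The constant and function names below are illustrative assumptions and are not taken from the microkernel.

```python
# Illustrative arithmetic only; all names are hypothetical.
FPR_COUNT = 32    # floating point registers, per the text
FPR_BITS = 64     # width of each floating point register
FPSCR_BITS = 32   # floating point status and control register


def fp_state_bytes() -> int:
    """Bytes needed to hold the complete floating point register state."""
    return (FPR_COUNT * FPR_BITS + FPSCR_BITS) // 8


def fp_bytes_moved_per_switch(has_fp_context: bool) -> int:
    """Floating point bytes saved for the outgoing thread plus bytes
    restored for the incoming one, assuming both threads have the same
    floating point status."""
    return 2 * fp_state_bytes() if has_fp_context else 0


print(fp_state_bytes())                  # 260
print(fp_bytes_moved_per_switch(True))   # 520
print(fp_bytes_moved_per_switch(False))  # 0
```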
OBJECTS OF THE INVENTION

It is therefore an object of the invention to provide 
improved efficiency in the operation of a processor running 
a microkernel operating system. 

It is another object of the invention to provide improved
speed in the operation of a processor in a microkernel
architecture.

It is a further object of the invention to provide improved 
multiprocessor support for a PowerPC processor running a 
microkernel operating system. 

SUMMARY OF THE INVENTION 

These and other objects, features and advantages are 
accomplished by the exception handling method and appa- 
ratus disclosed herein. The floating point exception problem 
is solved by the lazy context restore feature of the exception 
handling invention. 

The invention begins by creating a thread in the memory
without the floating point context indication in the thread's
process control block (pcb). In accordance with the invention,
this will prevent the copying of the floating point
registers of the processor on which the thread has been
running, when its execution is terminated after a fault or
interrupt.

While executing during a first occurring session, only 
fixed point (integer) operations will be carried out by the 
thread in the processor using the plurality of fixed point 
registers of the processor. 

When a fault or an interrupt occurs terminating the first 
session (context switch time), the thread is removed from 
execution in the processor and the contents of the fixed point 
registers are stored in the thread's process control block. In 
response to the stored indication of no floating point context, 
the contents of the plurality of floating point registers in the 
processor are not stored in the thread's process control 
block. This significantly improves the overall performance 
of the system. 
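
The save path at context switch time can be sketched as follows. This is a simplified model under stated assumptions, not the microkernel's code: the class names, the has_fp_context field, and the register counts used for the return value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NUM_GPRS = 32  # fixed point (general purpose) registers
NUM_FPRS = 32  # floating point registers


@dataclass
class Processor:
    gprs: List[int] = field(default_factory=lambda: [0] * NUM_GPRS)
    fprs: List[float] = field(default_factory=lambda: [0.0] * NUM_FPRS)


@dataclass
class ProcessControlBlock:
    has_fp_context: bool = False  # the floating point context indication
    saved_gprs: List[int] = field(default_factory=lambda: [0] * NUM_GPRS)
    saved_fprs: Optional[List[float]] = None  # no space until it is needed


def save_context(cpu: Processor, pcb: ProcessControlBlock) -> int:
    """Save the outgoing thread's registers; return how many were copied."""
    pcb.saved_gprs = list(cpu.gprs)
    copied = NUM_GPRS
    if pcb.has_fp_context:
        # Only threads that have obtained floating point context pay for this.
        pcb.saved_fprs = list(cpu.fprs)
        copied += NUM_FPRS
    return copied


cpu = Processor()
cpu.gprs[3] = 42
integer_thread = ProcessControlBlock()
print(save_context(cpu, integer_thread))  # 32
```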

Later, when the thread's execution is restored during a 
second occurring session, either in the same processor, or in 
an alternate processor, the contents of the process control 
block are examined to determine the state of the floating 
point context indication. Since the indication is that the 
thread does not have the floating point context, only fixed 
point operations are to be carried out with the thread in the 
processor using the plurality of fixed point registers. Thus, 
there is no attempt to copy back from the thread's process 
control block, values to load into the processor's floating 
point registers. This provides a significant improvement in
the overall performance of the system.

If the sequence of program instructions being run by the
thread attempts to execute a floating point instruction during
the second session, the floating point exception handler is
called.

The exception handler stores an alternate indication in the 
processor's machine state register, that the floating point 
context is available for the thread. This enables the thread to 
perform floating point operations. The thread then resumes 
execution of the floating point instruction in the processor. 
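
A minimal sketch of this on-demand path follows, combining the handler steps named in the abstract (allocate a save area, initialize the registers, turn on the float-available bit). The value 0x2000 is the FP bit of the 32-bit PowerPC machine state register; every other name, including the traps counter, is a hypothetical added for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MSR_FP = 0x2000  # floating point available bit in the 32-bit PowerPC MSR
NUM_FPRS = 32


@dataclass
class Thread:
    msr: int = 0                                # thread's MSR image
    fp_save_area: Optional[List[float]] = None  # allocated on first FP use


@dataclass
class Processor:
    fprs: List[float] = field(default_factory=lambda: [0.0] * NUM_FPRS)
    fpscr: int = 0
    traps: int = 0  # float-unavailable exceptions taken (illustration only)


def float_unavailable_handler(cpu: Processor, thread: Thread) -> None:
    """Make floating point available to the thread that requires it."""
    thread.fp_save_area = [0.0] * NUM_FPRS  # allocate space for saving FPRs
    cpu.fprs = [0.0] * NUM_FPRS             # initialize the registers
    cpu.fpscr = 0
    thread.msr |= MSR_FP                    # turn on the float-available bit


def run_fp_instruction(cpu: Processor, thread: Thread,
                       instruction: Callable[[Processor], None]) -> None:
    """Model one floating point instruction issued by the thread."""
    if not thread.msr & MSR_FP:
        cpu.traps += 1
        float_unavailable_handler(cpu, thread)
    instruction(cpu)  # the faulting instruction then resumes


def load_constant(cpu: Processor) -> None:
    cpu.fprs[1] = 2.5


cpu, thread = Processor(), Thread()
run_fp_instruction(cpu, thread, load_constant)
run_fp_instruction(cpu, thread, load_constant)
print(cpu.traps)  # 1
```

Only the first floating point instruction traps; once the bit is set the thread keeps its floating point context for the rest of its life.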

If another fault or interrupt occurs, forcing a termination
of the execution of the thread in the processor (context
switch time), the thread is removed from the processor,
terminating the second session. This time, the contents of
both the plurality of fixed point registers and the plurality of
floating point registers in the processor are stored in the
thread's process control block in response to the alternate
indication in the machine state register, that it is enabled for
floating point operations. The alternate indication in the
machine state register of the processor is also copied into the
thread's process control block. Thus, only those threads that
are performing floating point operations have the floating
point registers copied at the termination of the thread's
execution session in the processor.

Later, when the thread's execution is restored during a
third occurring session, either in the same processor, or in an
alternate processor, the contents of the process control block
are examined to determine the state of the floating point
context indication. Since the indication is that the thread
does have the floating point context, both floating point and
fixed point operations are to be carried out with the thread
in the processor using the plurality of floating point and fixed
point registers. Thus, the microkernel copies back from the
thread's process control block, values to load into the
processor's floating point registers, in addition to the values
to load into the processor's fixed point registers. The microkernel
also copies back from the thread's process control
block the floating point context indication, which it loads in
the processor's machine state register. Thus, only those
threads that are performing floating point operations have
values copied out of their process control blocks to load into
the processor's floating point registers at the restoration of
execution of the threads in the processor.
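
The restore path can be sketched in the same simplified style. The names are hypothetical, and clearing the FP bit for a thread without floating point context is an assumption of this sketch: it is what lets a later floating point instruction raise the float-unavailable exception.

```python
from dataclasses import dataclass, field
from typing import List

MSR_FP = 0x2000  # floating point available bit in the 32-bit PowerPC MSR
NUM_GPRS = 32
NUM_FPRS = 32


@dataclass
class ProcessControlBlock:
    has_fp_context: bool = False
    saved_gprs: List[int] = field(default_factory=lambda: [0] * NUM_GPRS)
    # Preallocated here only to keep the sketch short.
    saved_fprs: List[float] = field(default_factory=lambda: [0.0] * NUM_FPRS)


@dataclass
class Processor:
    msr: int = 0
    gprs: List[int] = field(default_factory=lambda: [0] * NUM_GPRS)
    fprs: List[float] = field(default_factory=lambda: [0.0] * NUM_FPRS)


def restore_context(cpu: Processor, pcb: ProcessControlBlock) -> int:
    """Load the incoming thread's registers; return how many were copied."""
    cpu.gprs = list(pcb.saved_gprs)
    copied = NUM_GPRS
    if pcb.has_fp_context:
        cpu.fprs = list(pcb.saved_fprs)
        cpu.msr |= MSR_FP   # the indication is loaded into the MSR
        copied += NUM_FPRS
    else:
        cpu.msr &= ~MSR_FP  # assumption: FP stays unavailable to this thread
    return copied


cpu = Processor()
fp_thread = ProcessControlBlock(has_fp_context=True)
fp_thread.saved_fprs[2] = 1.25
print(restore_context(cpu, fp_thread))              # 64
print(restore_context(cpu, ProcessControlBlock()))  # 32
```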

The invention has the following advantages: 

a. Context switch duration is greatly reduced if an applica- 
tion has threads that do not need their floating point 
registers saved, since floating point hardware is only 
made available to a thread on demand. 

b. Since the entire context of the thread is saved in its 
process control block once it obtains floating point capa- 
bility, a thread can be scheduled across multiple proces- 
sors in a symmetric multiprocessing implementation of 
the microkernel. 

In this manner, the exception handling method and apparatus
provides improved efficiency in the operation of a
processor running a microkernel operating system.

In an alternate embodiment of the invention, if a processor 
has only one thread executing within it that has the floating 
point context, then the contents of that processor's floating 
point registers do not need to be saved when that thread is 
removed from the processor. If all other threads executing 
within that processor are not using the floating point regis- 
ters, the values loaded into those registers by the sole 45 
floating point thread remain untouched. In accordance with 
the invention, each processor maintains a data structure in 
the memory that stores the name of the sole floating point 
thread that is executing in the respective processor. Then, 
when a second thread having a floadng point context is to 
begin execution in the processor, the processor calls the 
floating point exception handler. The floating point excep- 
tion handler then copies the contents of the processor's 
floating point registers, gets the name of first thread from the 
data structure, and saves the copied values in the process 
control block for the named first thread. Then the second 
thread can begin execution in the processor, and can load its 
own values into the processor's floating point registers. In 
this manner, the contents of the floating point registers of the 
processor need not be saved at all, if there is only one 
floating point thread executing in that processor. 
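
The alternate embodiment above can be sketched in C. This is a minimal illustration only, assuming hypothetical types (`fp_state`, `thread`, `processor`) and a hypothetical function `fp_claim`; it models the per-processor record of the sole floating point thread with a plain pointer and does not reproduce the actual handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_FPRS 32

/* Hypothetical model of a floating point register image. */
struct fp_state {
    double fpr[NUM_FPRS];
    unsigned long fpscr;
};

struct thread {
    int has_fp_context;        /* set once the thread first uses floating point */
    struct fp_state saved_fp;  /* copy kept in the process control block */
};

/* Hypothetical model of one processor: live registers plus the
 * per-processor record naming the sole floating point thread. */
struct processor {
    struct fp_state live_fp;
    struct thread *fp_owner;   /* NULL when no thread owns the registers */
    unsigned saves;            /* counts register saves, for illustration */
};

/* Called when thread t is about to use floating point on cpu. */
static void fp_claim(struct processor *cpu, struct thread *t)
{
    assert(cpu != NULL && t != NULL);
    if (cpu->fp_owner == t)
        return;                                  /* registers already hold t's values */
    if (cpu->fp_owner != NULL) {
        cpu->fp_owner->saved_fp = cpu->live_fp;  /* deferred save of the old owner */
        cpu->saves++;
    }
    if (t->has_fp_context) {
        cpu->live_fp = t->saved_fp;              /* restore t's earlier values */
    } else {
        memset(&cpu->live_fp, 0, sizeof cpu->live_fp);
        t->has_fp_context = 1;
    }
    cpu->fp_owner = t;
}
```

As long as only one thread ever calls `fp_claim` on a processor, `saves` stays at zero, which is the saving the embodiment describes.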

For multiprocessor configurations, when the first thread is 
to resume execution in a different processor, the floating 
point exception handler is called to copy the contents of the 
floating point registers of the first processor, to those of the 
second processor, if the first thread was the sole floating 
point thread that was executing in the first processor.

BRIEF DESCRIPTION OF THE DRAWING(S)

These and other objects, features, and advantages will be
more fully appreciated with reference to the accompanying 
figures. 

FIG. 1 is a functional block diagram of the Microkernel 
System 115 in the memory 102 of the host multiprocessor 
100, showing how the microkernel and personality-neutral 
services 140 run multiple operating system personalities on 
a variety of hardware platforms, including the PowerPC 
processor. 

FIG. 1A shows the PowerPC user register set.
FIG. 1B shows the major parts of the PowerPC exception
handler 190. 

FIG. 2 shows a flow diagram of the first level interrupt 
handler, which is part of the PowerPC exception handler 
190. 

FIG. 3 shows a flow diagram of the pre-second level 
interrupt handler, which is part of the PowerPC exception 
handler 190. 

FIG. 4 shows a flow diagram of the second level interrupt 
handler, which is part of the PowerPC exception handler 
190. 

FIG. 5 shows a flow diagram of the exit handler, which is 
part of the PowerPC exception handler 190. 

FIG. 6 shows a flow diagram of the lazy floating point
exception handler 192, which is part of the PowerPC excep- 
tion handler 190. 

FIG. 7 shows a layout of the floating point status and 
control register. 

FIG. 8 shows a flow diagram of the alignment exception 
handler 194, which is part of the PowerPC exception handler 
190. 

DESCRIPTION OF THE ILLUSTRATIVE 
EMBODIMENT(S) 

Part A. The Microkernel System 

Section 1. Microkernel Principles 

FIG. 1 is a functional block diagram of the Microkernel
System 115, showing how the microkernel 120 and person- 
ality-neutral services 140 run multiple operating system 
personalities 150 on a variety of hardware platforms. 

The host multi-processor 100 shown in FIG. 1 includes
memory 102 connected by means of a bus 104 to an
auxiliary storage 106 which can be for example a disc drive,
a read only or a read/write optical storage, or any other bulk
storage device. Also connected to the bus 104 is the I/O
adaptor 108 which in turn may be connected to a keyboard, 
a monitor display, a telecommunications adaptor, a local 
area network adaptor, a modem, multi-media interface 
devices, or other I/O devices. Also connected to the bus 104 
is a first processor A, 110 and a second processor B, 112. The 
processors 110 and 112 are PowerPC (TM) processors, as 
described above. The example shown in FIG. 1 is of a 
symmetrical multi-processor configuration wherein the two 
uni-processors 110 and 112 share a common memory 
address space 102. Other configurations of single or multiple 
processors can be shown as equally suitable examples. The 
processors can be other types, for example, an Intel 386 
(TM) CPU, Intel 486 (TM) CPU, a Pentium (TM) processor, 
or other uni-processor devices. 

The memory 102 includes the microkernel system 115
stored therein, which comprises the microkernel 120, the
machine dependent code 125, the personality neutral services
(PNS) 140, and the personality servers 150. In accordance
with the invention, the machine dependent code 125
includes the PowerPC exception handler 190. Included in
the PowerPC exception handler 190 are the floating point
exception handler 192 and the alignment exception handler
194. Each processor maintains a data structure in the memory
102 of FIG. 1: data structure 196A for processor A 110 and
data structure 196B for processor B 112. The data structure
196A stores the name of the sole floating point thread that is
executing in the respective processor A 110. Similarly, the
data structure 196B stores the name of the sole floating point
thread that is executing in the respective processor B 112. The
microkernel system 115 serves as the operating system for
the application programs 180 stored in the memory 102.

An objective of the invention is to provide an operating
system that behaves like a traditional operating system such 
as UNIX or OS/2. In other words, the operating system will 
have the personality of OS/2 or UNIX, or some other 
traditional operating system. 

The microkernel 120 contains a small, message-passing
nucleus of system software running in the most privileged
state of the host multi-processor 100, 115 that controls the
basic operation of the machine. The microkernel system
includes the microkernel 120 and a set of servers and device
drivers that provide personality neutral services 140. As the
name implies, the personality neutral servers and device
drivers are not dependent on any personality such as UNIX
or OS/2. They depend on the microkernel 120 and upon each
other. The personality servers 150 use the message passing
services of the microkernel 120 to communicate with the
personality neutral services 140. For example, UNIX, OS/2 
or any other personality server can send a message to a 
personality neutral disc driver and ask it to read a block of 
data from the disc. The disc driver reads the block and 
returns it in a message. The message system is optimized so 
that large amounts of data are transferred rapidly by manipu- 
lating pointers; the data itself is not copied. 

By virtue of its size and ability to support standard 
programming services and features as application programs, 
the microkernel 120 is simpler than a standard operating 
system. The microkernel system 115 is broken down into
modular pieces that are configured in a variety of ways, 
permitting larger systems to be built by adding pieces to the 
smaller ones. For example, each personality neutral server 
140 is logically separate and can be configured in a variety 
of ways. Each server runs as an application program and can 
be debugged using application debuggers. Each server runs 
in a separate task and errors in the server are confined to that 
task. 

FIG. 1 shows the microkernel 120 including the interpro- 
cess communications module (IPC) 122, the virtual memory 
module 124, tasks and threads module 126, the host and
processor sets 128, I/O support and interrupts 130, and
machine dependent code 125.

The personality neutral services 140 shown in FIG. 1
include the multiple personality support 142, which
includes the master server, initialization, and naming. They also
include the default pager 144 and the device
support 146, which includes multiple personality support and
device drivers. They also include other personality neutral
products 148, including a file server, network services,
database engines and security.

The personality servers 150 are for example the dominant
personality 152 which can be, for example, a UNIX personality.
It includes a dominant personality server 154 which
would be a UNIX server, and other dominant personality
services 155 which would support the UNIX dominant
personality. An alternate dominant personality 156 can be
for example OS/2. Included in the alternate personality 156
are the alternate personality server 158 which would characterize
the OS/2 personality, and other alternate personality
services 159 for OS/2.

Dominant personality applications 182 shown in FIG. 1, 
associated with the UNIX dominant personality example, 
are UNIX-type applications which would run on top of the 
UNIX operating system personality 152. The alternate per- 
sonality applications 186 shown in FIG. 1, are OS/2 appli- 
cations which run on top of the OS/2 alternate personality 
operating system 156. 

FIG. 1 shows that the Microkernel System 115 carefully 
splits its implementation into code that is completely por- 
table from processor type to processor type and code that is 
dependent on the type of processor in the particular machine 
on which it is executing. It also segregates the code that 
depends on devices into device drivers; however, the device 
driver code, while device dependent, is not necessarily 
dependent on the processor architecture. Using multiple 
threads per task, it provides an application environment that
permits the use of multi-processors without requiring that
any particular machine be a multi-processor. On uni-proces- 
sors, different threads run at different times. All of the 
support needed for multiple processors is concentrated into 
the small and simple microkernel 120. 

The above cited patent applications provide a more 
detailed description of the Microkernel System 115, including
the architectural model, tasks, threads, ports, and interprocess
communications, and features of the microkernel
120. The virtual environment provided by the microkernel
120 consists of a virtual processor that executes all of the
user space accessible hardware instructions, augmented by
emulated instructions (system traps) provided by the kernel;
the virtual processor accesses a set of virtualized registers
and some virtual memory that otherwise responds as does
the machine's physical memory. All other hardware
resources are accessible only through special combinations
of memory accesses and emulated instructions. Of course, it
is a physical processor that actually executes the instructions 
represented by the threads. 

Each physical processor that is capable of executing 
threads is named by a processor control port. Although 
significant in that they perform the real work, processors are 
not very significant in the microkernel, other than as members
of a processor set. It is a processor set that forms the
basis for the pool of processors used to schedule a set of 
threads, and that has scheduling attributes associated with it. 
The operations supported for processors include assignment 
to a processor set and machine control, such as start and 
stop. 

FIG. 1 shows the PowerPC as the processor 110 and 112. 
The PowerPC, as described above, is an advanced RISC 
(reduced instruction set computer) architecture, described in
the book: IBM Corporation, "The PowerPC Architecture",
Morgan-Kaufmann, San Francisco, 1994. Another description
of the PowerPC is provided in the article: Keith Diefendorff,
Rich Oehler, and Ron Hochsprung, "Evolution of
the PowerPC Architecture", IEEE Micro, April 1994, pp. 
34-49. 

FIG. 1A shows the PowerPC user register set, including
the condition register CR, the link register LR, the count
register CTR, the 32 general purpose registers GPR 00 to
GPR 31, the fixed point exception register XER, the 32
floating point registers FPR 00 to FPR 31, and the floating
point status and control register FPSCR.

An exception is an error, unusual condition, or external
signal, that may set a status bit and may or may not cause an
interrupt, depending upon whether or not the corresponding
interrupt is enabled.

To transparently process an exception, the machine state
must be saved, the exception fully decoded, the exception
handled, the machine state restored and control returned to
where the exception occurred. There are four levels of
exception processing in the PowerPC exception handler 190,
as shown in FIG. 1B.

i) the first level interrupt handler (FLIH) of FIG. 2. 

ii) the pre-second level interrupt handler (call_slih( )) of 
FIG. 3. 

iii) the second level interrupt handler (SLIH) of FIG. 4. 

iv) the exit handler of FIG. 5. 

Processing for most exceptions conforms to the above-mentioned
four-step approach. However, a few exceptions
must be processed as quickly as possible so that
overall system performance is not affected. One
such exception is alignment related and is discussed in
detail in subsequent sections.

Exception Processing Steps 

This section describes the details involved in the four
levels of exception processing. Although most of the exceptions
on the PowerPC are processed this way, a few exceptions,
owing to their very nature and for performance reasons, are not
handled strictly according to the general four step processing
model.

The First Level Interrupt Handler is shown in the flow diagram
of FIG. 2, which is part of the PowerPC exception handler
190.

The FLIH is responsible for all the low-level machine
setup so that the kernel can run. This includes turning on
translations and jumping to high memory. The FLIH must do
any decoding of the exception to completely define and load
it into a common location so that the next handler will know
what routines to call.

The flow diagram of FIG. 2 has the following steps: 

i) using the special purpose registers SPRG0-SPRG1,
GPR2 and GPR3 are saved.

ii) GPR3 is set to the physical address of the CPU_VAR
structure.

iii) GPR2 is set to VM_KERNEL_PHYS_SEG upper to
be used by FLIH_PANIC.

iv) GPR4 and GPR5 are saved into the CPU_VAR's scratch
fast save area.

v) save SRR0 and SRR1 in GPR4 and GPR5.

vi) prepare for and jump to high memory. Note that call_slih
is in virtual high memory.

vii) GPR2 is set to the TOC offset of call_slih from the
CPU_VARS structure.

viii) SRR0 (i.e., IAR) is set to the value of GPR2.

ix) GPR3 is set to the address of the save state area.

x) translations enabled, i.e., SRR1 is set to MSR_IR and
MSR_DR.

xi) GPR2 is set to the actual SLIH entry TOC offset.

xii) perform an rfi. 

FIG. 3 is a flow diagram of The Pre-Second Level 
Interrupt Handler (call_slih( )), which is part of the Pow- 
erPC exception handler 190. 

This routine has no knowledge of what exception has
occurred. Its purpose is to save any remaining state and do the
common stack manipulations prior to calling the SLIH. The
call_slih( ) routine determines whether the stack currently
pointed to by r1 is a kernel stack. Once a kernel stack is
guaranteed to be in r1, call_slih( ) allocates a ppc_saved_state
and a c-frame on the stack. The call_slih( ) then
branches to the address of SLIH saved by the FLIH.

FIG. 3 has the following steps:

i) save the rest of the machine state to the state save area in 
the CPU_VARS structure. 

The TOC register has been initialized. GPR2 has the
Kernel TOC.

All the state has been saved into the PPC save state
structure pointed to by GPR3.
GPR4 contains the value of SRR1 (old msr)
GPR5 contains the value of DSISR
GPR6 contains the value of DAR
GPR7 contains the cpu_vars pointer
GPR10 contains the SLIH TOC entry point.

ii) examine the MSR[PR] bit to determine if this exception
caused a user-to-kernel or kernel-to-kernel transition, in order to
know whether the kernel stack is to be restored in r1 and to
choose the appropriate exit routine to return through. If the PR bit
was set to 1, the address of load_and_go_user is loaded
into GPR14. Otherwise load r1 to point to the kernel
stack from the CPU_VARS structure pointed to by GPR7.

iii) allocate a new ppc_saved_state area on the kernel stack
and back chain it to the previous ppc_saved_state on the
list.

iv) update the next pointer cv_next_rss in the rss chain to
point to the new ppc_saved_state in the CPU_VARS
structure.

v) allocate a c frame on the stack and null back chain it. 

vi) Load GPR13 with the PPC_SAVED_STATE currently
in GPR3. (GPR13 is non-volatile and GPR3 is
volatile according to the linkage conventions.)

vii) Move GPR10 to the Count Register. The Count
Register should now hold the address of the SLIH.

viii) branch through the count register to the SLIH. 
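
The save-area bookkeeping in steps iii) and iv) behaves like a stack of chained save areas. The following is a minimal sketch, assuming simplified hypothetical types (`ss_model`, `cpu_model`) and helper names (`enter_exception`, `leave_exception`) in place of the actual ppc_saved_state and cpu_vars layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ppc_saved_state: only the chain link is modeled. */
struct ss_model {
    struct ss_model *ss_chain;   /* previous save area in the chain */
};

/* Hypothetical stand-in for the per-CPU cv_next_rss pointer. */
struct cpu_model {
    struct ss_model *next_ss;    /* area the next exception will save into */
};

/* On entry: state was just saved into cpu->next_ss. Chain a fresh area
 * behind it and make the fresh area the target of the next exception. */
static struct ss_model *enter_exception(struct cpu_model *cpu,
                                        struct ss_model *fresh)
{
    struct ss_model *current = cpu->next_ss;
    assert(fresh != NULL);
    fresh->ss_chain = current;   /* back chain to the previous save area */
    cpu->next_ss = fresh;
    return current;              /* the area describing this exception */
}

/* On exit: hand the current area back so the next exception reuses it. */
static void leave_exception(struct cpu_model *cpu, struct ss_model *current)
{
    cpu->next_ss = current;
}
```

Following `ss_chain` from the newest area back to the oldest visits every nested exception in turn.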

FIG. 4 shows the Second Level Interrupt Handler (SLIH), 
which is part of the PowerPC exception handler 190. 

The SLIH is generally a 'C' routine that actually handles
the exception and returns. It is left entirely to the discretion
of the SLIH whether to have the external interrupts re-enabled while
it is executing. The SLIH will not return to the call_slih( )
routine because call_slih( ) loaded the correct exit
routine in the link register prior to calling the SLIH.

FIG. 5 shows the Exit Handler, which is part of the
PowerPC exception handler 190.

This is the final phase of exception processing. The exit
handler could be either load_go_sys (it was a kernel to
kernel transition) or load_go_user (user to kernel transition).
The only difference between these two routines is that
in the load_go_user routine, the asynchronous system traps
are processed.

FIG. 5 has the following steps: 

i) disable external interrupts. 

ii) if returning to user mode, check the global variable
need_ast. If set, re-enable external interrupts and call the
AST handler, ast( ). When done, return to the top of exit
handler load_and_go_user( ).

iii) BAT3 is enabled, establishing a virtual equals real
mapping for low memory. To turn off translations, the
kernel must be executing in low memory. A mapping of
low memory must be present to do the jump.

iv) store current ppc_saved_state to the per CPU variable
cv_next_rss. This causes the current ppc_saved_state
area to be re-used on the next exception.

v) restore two registers temporarily in two SPRs so that there
is room to store SRR0 and SRR1.

vi) Put the physical address of scratch save area/CPU_VARS
into r1.

vii) restore the remaining machine state.

viii) jump down to low memory. Here it jumps to physical
address 0x3000 where the LOW_ADDRESS_Rn( ) is.
This routine does the following:
disable translations,
restore SRR0 and SRR1 from the scratch registers,
restore the two scratch registers from the special purpose
registers,
rfi (return from interrupt).

State transitions

The PowerPC can be executing in one of two states. 

kernel 

user 

The exceptions always force a transition to the kernel 
mode no matter what mode the processor was in at the time 
the exception occurred. When an exception is accepted, 
execution resumes in kernel mode and in order for the kernel 
to execute properly, it should be executing on its own stack. 
If the processor was executing in user mode at the time the 
exception happened, then the stack pointer points to the 
stack of the user thread and hence the stack needs to be 
switched. A kernel to kernel transition does not require the 
stack switch.

The exception processing is dependent on this state tran- 
sition. If user to kernel transition has occurred, then the 
errors are reported through the thread's exception port. But 
any fatal errors such as illegal or alignment exceptions
occurring in the context of a kernel to kernel transition are
reported through a panic mechanism which halts the system. 
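
The two decisions described above, which stack to run on and where a fatal error is reported, both depend on the privilege bit of the saved MSR. A minimal sketch follows; the `MSR_PR` mask is the 32-bit PowerPC problem-state bit, while `cpu_model`, `select_stack` and `select_report_path` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_PR 0x4000u   /* problem-state bit in the saved MSR: 1 = user mode */

enum report_path { REPORT_EXCEPTION_PORT, REPORT_PANIC };

/* Hypothetical per-CPU record; kernel_stack stands in for cv_kernel_stack. */
struct cpu_model {
    uintptr_t kernel_stack;
};

/* Stack the kernel should run on after the exception is accepted. */
static uintptr_t select_stack(uint32_t saved_msr, uintptr_t current_sp,
                              const struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (saved_msr & MSR_PR)
        return cpu->kernel_stack;   /* user to kernel: switch stacks */
    return current_sp;              /* kernel to kernel: keep the current stack */
}

/* Where a fatal error is reported, depending on the transition. */
static enum report_path select_report_path(uint32_t saved_msr)
{
    return (saved_msr & MSR_PR) ? REPORT_EXCEPTION_PORT : REPORT_PANIC;
}
```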

The exit path out of exception is dependent on the 
transition. In the case of kernel to kernel transition, it exits 
out of the exception via the common exception path. But 
user to kernel mode transition may have to take care of some 
of the work that has been accumulated in the course of 
exiting from the exception to the user mode. For example a 
decrementer exception can occur if a second level interrupt 
handler had the external interrupts enabled. This means a 
context switch is necessary in the middle of a deeply nested 
exception. Instead of doing the context switch, a global 
kernel variable can be set to indicate that some external 
interrupts (Asynchronous System Traps) are pending to be 
processed. So on the way out of user to kernel exceptions,
the occurrence of ASTs must be checked. 
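
The deferred-work check on the way back to user mode can be sketched as a loop. This is an illustration under stated assumptions: `need_ast` and `ast( )` follow the names used in this description, while the interrupt enable and disable functions are empty placeholders and `ast_calls` exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>

static volatile bool need_ast;   /* set when asynchronous work is pending */
static int ast_calls;            /* illustration only: counts handler runs */

/* Placeholder AST handler: performs the pending work and clears the flag. */
static void ast(void)
{
    need_ast = false;
    ast_calls++;
}

/* Placeholders for the machine-specific interrupt controls. */
static void disable_external_interrupts(void) {}
static void enable_external_interrupts(void) {}

/* Exit path for a user to kernel exception. */
static void exit_to_user(void)
{
    for (;;) {
        disable_external_interrupts();
        if (!need_ast)
            break;                     /* nothing pending: leave with interrupts off */
        enable_external_interrupts();  /* the AST handler runs with interrupts on */
        ast();
    }
    /* machine state would be restored here before returning to user mode */
}
```

The flag is re-checked with interrupts disabled after each call, so an AST posted while the handler runs is not lost.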

Data structures 

PowerPC machine state

Exceptions are inherently asynchronous. When they
occur, control is transferred to the FLIH. To transparently
process the exception, there is a set of processor registers
that must be preserved. This set of registers is referred to as 
machine state and it contains the following elements. 

i) GPRs 0-31

ii) SRR0 (address of where execution is to resume. It is
called iar in the ppc_saved_state structure)

iii) SRR1 (low 16 bits hold the value of MSR at the time of
exception. The high 16 bits may contain information
indicating the exact nature of exception. It is called msr in
ppc_saved_state)

iv) Link Register

v) Condition Register

vi) Count register

vii) XER register

viii) MQ register (This is available only on the PowerPC 601
implementation of the PowerPC architecture. This is to
ensure that the machine state set is a superset of all the
Power Architectures)

ppc_kernel_state

This corresponds to the state of kernel registers as saved
in a context-switch. It lives at the base of the kernel stack.

typedef struct ppc_kernel_state {
    int ks_ss;         /* preallocated ppc_saved_state */
    int ks_sp;         /* kernel stack pointer */
    int ks_lr;         /* link register */
    int ks_cr;         /* condition code register */
    int ks_reg13[19];  /* non volatile registers r13 - r31 */
    int ks_pad;        /* double word boundary */
};



cpu_vars 

This structure holds all of the per CPU global variables. 



typedef struct cpu_vars {
    /* these fields are read/write */
    struct fh_save_area cv_fast_save;  /* fast save area */
    ppc_state_t cv_next_ss;            /* next exception save area */
    ppc_state_t cv_user_ss;            /* user mode exception save area */
    vm_offset_t cv_kernel_stack;       /* per cpu stack */

    /* these fields are read-only after initialization */
    uint cv_toc;                       /* TOC value */
    vm_offset_t cv_call_slih;          /* address of common call_slih( ) routine */
    vm_offset_t cv_dsisr_jt;           /* physical address of DSISR jump
                                          table for alignment exc. handler */
    int cv_cache_bs;                   /* cache block size in bytes */
    int cv_cpu_number_ix;              /* cpu number index */
    int cv_cpu_number;                 /* cpu number */
    struct cpu_vars *cv_panic_slih;    /* cpu_vars on which to run panic_flih( ) */
    int cv_pad[6];                     /* cache line alignment */
} cpu_vars_t;



This structure is initialized in the ppc_init_stacks( ) routine.
The fields cv_cpu_number, cv_cpu_number_ix, cv_toc,
cv_call_slih and cv_panic_slih all hold constant values
and can be thought of as read-only after ppc_init_stacks( )
is complete. The other fields are dynamic. At the time of
ppc_init_stacks( ), there is no notion of user or kernel
stacks. To handle exceptions that may come in during this
time, the kernel makes use of the panic stack as its run-time and
exception stack.

This data structure is accessible to each cpu, with each
looking at its personal copy. The SPRG3 register, at the
time of initialization, is made to point to cpu_vars in the
ppc_init_stack( ) routine.

The ppc_saved_state is pre-allocated and cv_next_ss
always points to this area. In user mode, it always points to
the current thread's process control block (pcb). The
cv_user_ss always points to the current_thread's pcb. A
pointer to the bottom of the kernel stack of the thread is
maintained in cv_kernel_stack.

ppc_saved_state

This structure describes the machine state as saved upon
kernel entry. One structure lives in the pcb of the thread and
holds the user state saved at the initial transition from user
to kernel mode. Additional structures represent nested
exceptions or interrupts and live on the kernel stack, the first
of which lives just above the ppc_kernel_state.

The state save structures are pre-allocated. The variable
cv_next_rss in the per CPU structure always points to the
save area that will be used at the next fault or interrupt.
While running in user mode, it points to the pcb. The state
save structures are linked in a chain to enable stack tracking.



typedef struct ppc_saved_state {
    int regs[32];   /* user's GPRs */
    int iar;        /* user's instruction address register */
    int msr;        /* user's machine state register */
    int cr;         /* user's condition register */
    int lr;         /* user's link register */
    int ctr;        /* user's count register */
    int xer;        /* user's storage exception register */
    int mq;         /* user's mq register */
    int ss_chain;   /* pointer to previous exception in chain */
    int ss_reason;  /* argument to pr_slih( ) */
    int ss_vaddr;
    int ss_extra;   /* padding bytes to double word boundary */
} *ppc_state_t;

floatsave - floating point state structure

typedef struct floatsave {
    double fp_regs[32];  /* 32 64-bit floating point user registers */
    long fp_dummy;       /* 32 bits of padding so fp_scr can be stfd/lfd */
    long fp_scr;         /* floating point status and control register */
};

pcb_t

This structure holds the user-mode machine state associated
with a particular thread. The ppc_saved_state structure
is filled in on transition from user to kernel mode. The
floatsave structure is filled in lazily when some other thread
needs the floating point unit.

typedef struct pcb {
    struct ppc_saved_state pcb_ss;
    struct floatsave pcb_fp;
    struct ppc_machine_state ims;
} *pcb_t;

fh_save_area

This structure provides the scratch area for storing registers
GPR25-31, the state save and restore registers SRR0 and
SRR1, and the other registers LR, CR and XER. This is allocated
in the CPU_VARS structure to be used by all the fast
handlers that do not use the four-step exception processing
scheme. The alignment exception handler is the only fast handler
that uses this area in its own FLIH.

struct fh_save_area {
    long fh_scratch0;
    long fh_scratch1;
    long fh_scratch2;
    long fh_scratch3;
    long fh_gpr25;
    long fh_gpr26;
    long fh_gpr27;
    long fh_gpr28;
    long fh_gpr29;
    long fh_gpr30;
    long fh_gpr31;
    long fh_srr0;
    long fh_srr1;
    long fh_lr;
    long fh_cr;
    long fh_xer;
};

Global variables

The following are system global variables that are used
primarily in exception processing.

1. active_threads[ ]

2. active_stacks[ ]

They have elements for each CPU on the system. Each
element points to the current thread and stack on that CPU,
e.g., active_threads[0] is the current thread on cpu 0 /* refer to
current_thread( ) macro definition */

[Note: Floating point exception handlers make use of
another kernel variable called "float_thread" which points
to the thread that has access to the floating point hardware]

Floating Point Exceptions

Float Unavailable

Introduction

This section briefly explains all the floating point related
exception scenarios in the PowerPC architecture. It also
provides information as to how the microkernel perceives
such exceptions in the context of an executing thread. It also
furnishes PowerPC architecture specific details such as the
bit settings for each of the exception types.

PowerPC Information

A floating point unavailable exception occurs when no
higher priority exceptions exist, an attempt is made to
execute a floating-point instruction (including floating-point
load, store and move instructions) and the floating point
available bit in the MSR is disabled (MSR[FP]=0).

The register settings for floating point unavailable exceptions
are given below.

SRR0   Set to the effective address of the instruction that caused
       the exception
SRR1   0-15   Cleared
       16-31  Loaded from bits 16-31 of the MSR
MSR    EE   0             SE   0
       PR   0             FE1  0
       FP   0             EP   not altered
       ME   not altered   IT   0
       FE0  0             DT   0



This exception type is vectored at 0x0800 in the exception
vector table. When a floating point unavailable exception is
taken, instruction execution resumes at offset 0x00800 from
the physical base address indicated by MSR[EP]. 
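
The vector address described above is a base selected by one MSR bit plus a fixed offset. A minimal sketch, assuming the usual 32-bit PowerPC convention that the exception prefix bit (EP here, called IP in later architecture documents, mask 0x0040) selects a base of 0x00000000 or 0xFFF00000; the function name `exception_vector` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_EP                0x0040u   /* exception prefix bit (assumed mask) */
#define FP_UNAVAILABLE_OFFSET 0x0800u   /* vector offset given in the text */

/* Physical address at which execution resumes for a given vector offset. */
static uint32_t exception_vector(uint32_t msr, uint32_t offset)
{
    uint32_t base = (msr & MSR_EP) ? 0xFFF00000u : 0x00000000u;
    assert(offset < 0x00100000u);   /* offsets lie below the prefix bits */
    return base + offset;
}
```

With the prefix bit clear, `exception_vector(msr, FP_UNAVAILABLE_OFFSET)` yields 0x00000800; with it set, 0xFFF00800.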

Microkernel Information 

lazy context restore policy 

The floating point hardware register set is not given to any
user level thread unless it is required to perform floating
point operations. Thus, for any non-floating thread, the
context does not include the floating point hardware state.
This effectively reduces the amount of information to be
handled at each context switch.

There are 32 64-bit floating point registers and a 32-bit
floating point status and control register in 32-bit PowerPC
processor implementations. These add up to 260 bytes of
information that would be saved and restored during a 
context switch even if the threads do not use them. 
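
The 260-byte figure is 32 registers of 8 bytes plus one 4-byte status register. The arithmetic can be checked with a short sketch; the structure name `fp_context_model` is hypothetical and is not the floatsave layout, which adds 32 bits of padding.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FPRS 32

/* Hypothetical model of the raw floating point context. */
struct fp_context_model {
    uint64_t fpr[NUM_FPRS];   /* 32 registers of 64 bits each */
    uint32_t fpscr;           /* 32-bit status and control register */
};

/* Payload bytes moved per context switch if the state were always saved. */
enum { FP_CONTEXT_BYTES = NUM_FPRS * 8 + 4 };

static_assert(FP_CONTEXT_BYTES == 260, "32 * 8 + 4 bytes");
```

A compiler may pad the structure itself to a multiple of 8 bytes, so `sizeof(struct fp_context_model)` can exceed the 260-byte payload.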

A thread, when it is created, is given a context save area
addressed as its PCB. The PCB consists of integer context
and float-context save areas. A newly created thread that is
scheduled for execution does not have a float save area addressed
by its pcb. The thread's MSR (machine state register) has a
bit to indicate the availability of floating point hardware to
the thread. It is initially set to zero.

During the course of a thread's execution, at the first
instance of an attempt by the thread to execute a floating
point instruction, the float unavailable exception occurs.
This in turn causes the microkernel's floating point exception
handler to be invoked. The function of this exception
handler is to make floating point available to the thread that
required it. The exception handler dynamically allocates
space for saving the thread's floating point registers,
initializes the registers and sets the float-available bit to 1 in
its machine state register (MSR). 
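
The handler's three actions (allocate, initialize, enable) can be sketched as follows. This is an illustration only: `thread_model`, `floatsave_model` and `float_unavailable` are hypothetical names, `calloc` stands in for whatever allocator the kernel uses, and `MSR_FP` is the 32-bit PowerPC floating point available bit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MSR_FP 0x2000u   /* floating point available bit in the MSR */

/* Hypothetical floating point save area. */
struct floatsave_model {
    double fp_regs[32];
    uint32_t fp_scr;
};

/* Hypothetical thread context: saved MSR plus a lazily allocated save area. */
struct thread_model {
    uint32_t msr;
    struct floatsave_model *fp_save;   /* NULL until first floating point use */
};

/* Returns 0 on success, -1 if no save area could be allocated. */
static int float_unavailable(struct thread_model *t)
{
    assert(t != NULL);
    if (t->fp_save == NULL) {
        /* calloc zero-fills the area, which initializes the register image */
        t->fp_save = calloc(1, sizeof *t->fp_save);
        if (t->fp_save == NULL)
            return -1;
    }
    t->msr |= MSR_FP;   /* the retried instruction will no longer trap */
    return 0;
}
```

Because the bit stays set in the thread's saved MSR, the exception is taken at most once per thread, matching the statement that a thread keeps its floating point context for the rest of its life.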

Once a thread obtains floating point context, it continues 
to have it during the remainder of its life. The flow chart of 
FIG. 6 illustrates the floating point exception handler 192, 
which is part of the PowerPC exception handler 190. 

The flow diagram of FIG. 6 starts by creating a thread in
the memory 102 without the floating point context indication
in the thread's process control block (pcb). In accordance
with the invention, this will prevent the copying of the
floating point registers of the processor 110 on which the
thread has been running, when its execution is terminated
after a fault or interrupt.

While executing during a first occurring session, only
fixed point (integer) operations will be carried out by the
thread in the processor using the plurality of fixed point
registers of the processor 110.

When a fault or an interrupt occurs terminating the first
session (context switch time), the thread is removed from
execution in the processor 110 and the contents of the fixed
point registers are stored in the thread's process control
block. The contents of the processor's machine state register
(MSR), including the state of the current floating point
context in the processor, are stored in the thread's process
control block. In response to the stored indication of no
floating point context, the contents of the plurality of floating
point registers in the processor are not stored in the thread's
process control block. This significantly improves the overall
performance of the system.

Later, when the thread's execution is restored during a
second occurring session, either in the same processor 110
or in an alternate processor 112, the contents of the process
control block are examined to determine the state of the
floating point context indication. Since the indication is that
the thread does not have the floating point context, only
fixed point operations are to be carried out with the thread
in the processor using the plurality of fixed point registers.
Thus, there is no attempt to copy back from the thread's
process control block, values to load into the processor's
floating point registers. This provides a significant
improvement in the overall performance of the system.

The indication that the thread does not have the floating
point context is copied back from the thread's process

and fixed point registers. Thus, the microkernel 120 copies
back from the thread's process control block, values to load
into the processor's floating point registers, in addition to the
values to load into the processor's fixed point registers. The
indication that the thread does have the floating point
context is copied back from the thread's process control
block, to the processor's machine state register. Thus, only
those threads that are performing floating point operations
have values copied out of their process control blocks to load
into the processor's floating point registers at the restoration
of execution of the threads in the processor.

In an alternate embodiment of the invention, if a processor
A 110 in FIG. 1 has only one thread executing within it that
has the floating point context, then the contents of that
processor's floating point registers do not need to be saved
when that thread is removed from the processor. If all other
threads executing within that processor 110 are not using the
floating point registers, the values loaded into those registers
by the sole floating point thread remain untouched. In
accordance with the invention, each processor maintains a
data structure in the memory 102 of FIG. 1: data structure
196A for processor A 110 and data structure 196B for
processor B 112. The data structure 196A stores the name of
the sole floating point thread that is executing in the
respective processor A 110. Similarly, the data structure 196B
stores the name of the sole floating point thread that is
executing in the respective processor B 112. Then, when a
second thread having a floating point context is to begin
execution in the processor A 110, the processor A 110 calls
the floating point exception handler 192. The floating point
exception handler 192 then copies the contents of the floating
point registers of processor A 110, gets the name of the first
thread from the data structure 196A, and saves the copied
values in the process control block for the named first
thread. Then the second thread can

control block, to the processor's machine state register. If the 35 begin execution in the processor A UO, and can load its own 

sequence of program instructions being run by the thread values into the processor's A UO floating point registers. In 

attempts to execute a floating point instruction during the this manner, the coiitents of the floating point registers of the 

second session, the floating point exception handler 192 is processor A UO need not be saved at all, if there is only one 

called by the processor (the instruction is trapped by the floating pomt thread executing in that processor A UO. 

microkernel 120). 40 For multiprocessor configurations, when the first thread is 

The exception handler 192 stores an alternate indication to resume execution in a different processor B U2, die 

in the processor's machine slate register that the floating floating point exception handler 192 is called to copy the 

point context is available for. the thread. This enables the contents of the floating point registers of the first processor 

thread to perform floalmg point operations. The thread then A UO, to those of die second processor B 112, if the first 

resumes execution of tiie floating point instmction. 45 thread was the sole floating point thread that was executing 

If another fault or interrupt occurs, forcing a termmation in the first processor A UO. 

of the execution of the thread in the processor (context Multiprocessing and performance 

switch time), the thread is removed from the processor, The lazy context restore policy is multiprocessor enabled, 

terminating the second session. This time, the contents of In the sense, the floating context is associated with the thread 

both the plurality of fixed point registers and the plurality of 50 executing as opposed to being to tied to a processor. In other 

floating point registers in die processor are stored in the words, earlier systems solved this problem by adopting a 

thread's process control block in response to the alternate lazy float context switch policy whereby only a thread owns 

indication that it is enabled for floating point operations. the floating point hardware at any time. 

Thus, only tiiose threads that are performing floating point In such a scheme, when a thread uaps into the kernel for 

operations have Uie floating point registers copied at die 55 getting the float context, the trap handler allocates and 

termination of the thread's execution session in the proccs- provides the thread floating point save area. It also desig- 

sor. The machine state register is copied to the tiiread's nates the thread as being the float diread of this processor. In 

process control block, including die floating point context the event, anotiier thread requires to use floating point 

status. hardware, it traps into the kernel. This time the trap handler 

Later, when the thread's execution is restored in during a 60 designates the new thread as die float thread for diis pro- 
third occurring session, either in the same processor UO. or cessor after saving the old float thread's floating point 
in an alternate processor 112, the contents of the process registers and restoring the new thread's floating point reg- 
control block are examined to determine the state of the isters. 

floating point context indication. Since the indication is that In a uniprocessor systems also, lazy context switch can be 

the thread does have the floating point context, both floating 65 expensive, particularly in float intensive applications since 

point and fixed point operations arc to be carried out with the the overall context switch time is increased by the exception 

Uiread in the processor using the plurality of floating point handling patii also. But widi lazy context restore policy. 
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because threads have the float context since obtaining it, all 
subsequent context switches would include both integer and 
floating point state. 

For a multi-processor system, the concept of tying a float
thread to a processor makes it difficult to obtain and move the
state information across processors. With the lazy float
restore policy, it is guaranteed that if a thread is a float
thread, it has its latest floating point state information when
it is ready to run on any processor.
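
The save and restore rules described above can be sketched as follows. This is a minimal illustration, not the microkernel's code: struct cpu, struct pcb, context_save and context_restore are hypothetical names, and the processor is modelled as a plain structure. The only architectural detail assumed is that MSR[FP] is the "float available" bit (mask 0x2000 in a 32-bit MSR).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MSR_FP   0x00002000u   /* MSR[FP], the "float available" bit */
#define NUM_GPR  32
#define NUM_FPR  32

/* Hypothetical, simplified models of a processor and a thread's PCB. */
struct cpu {
    uint32_t msr;
    uint32_t gpr[NUM_GPR];
    double   fpr[NUM_FPR];
};

struct pcb {
    uint32_t msr;             /* saved MSR, including the FP indication */
    uint32_t gpr[NUM_GPR];
    double   fpr[NUM_FPR];    /* meaningful only when MSR_FP is set */
};

/* Switch out: integer state is always saved; floating point state is
 * saved only for threads that own a floating point context. */
void context_save(const struct cpu *cpu, struct pcb *pcb)
{
    assert(cpu != NULL && pcb != NULL);
    pcb->msr = cpu->msr;
    memcpy(pcb->gpr, cpu->gpr, sizeof pcb->gpr);
    if (cpu->msr & MSR_FP)
        memcpy(pcb->fpr, cpu->fpr, sizeof pcb->fpr);
}

/* Switch in, on the same processor or a different one: the state
 * travels with the thread's PCB, not with the processor. */
void context_restore(struct cpu *cpu, const struct pcb *pcb)
{
    assert(cpu != NULL && pcb != NULL);
    cpu->msr = pcb->msr;
    memcpy(cpu->gpr, pcb->gpr, sizeof cpu->gpr);
    if (pcb->msr & MSR_FP)
        memcpy(cpu->fpr, pcb->fpr, sizeof cpu->fpr);
}
```

Because the floating point image lives in the PCB, restoring on a second struct cpu behaves exactly like restoring on the first, which is the property the multiprocessor discussion relies on.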

External Interface Details 

This section explains the interface details of the SLIH for
the float_unavailable exception.
The name of the SLIH is float_unavailable( ).
It is invoked as follows:

void float_unavailable(struct ppc_saved_state *state)

It expects the ppc_saved_state to be passed to it by the
pre-second level interrupt handler call_slih( ) routine.
Data structures 

The following global data structures are affected by this 
routine. 

The thread data structure of the current thread in which
this fault has occurred. This routine essentially changes the
thread's machine state by changing the MSR bit settings in
the thread's pcb. It also restores the thread's floating point
context by loading the floating point registers from the
thread's float save area in the pcb.

Functional Description 

Float Unavailable

Function name: float_unavailable( )

Purpose: To handle the float_unavailable exception that
occurred in a thread.

Prototype: void float_unavailable(struct ppc_saved_state *state);

Input: The machine state as saved upon kernel entry.
Output: none
Return values: none
Error codes:
Routines invoked: panic( ), float_load( ), float_store( )

Logic:

If it has happened in the privileged/supervisor mode then
panic and quit.

Fetch the current thread.

Allocate a float save area (260 bytes) and make the thread's
float save area pointer point to it.

Initialize all the registers.

Turn on its MSR[FP] in its pcb.

Load the floating point registers with the current thread's
float save area.
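
The logic above can be sketched in C as follows. This is a user-space illustration under stated assumptions, not the handler itself: float_unavailable_sketch, struct pcb and struct float_save are hypothetical names, the 260-byte area is assumed to hold 32 eight-byte register images plus a 4-byte FPSCR image, calloc stands in for the kernel allocator, and a -1 return stands in for panic( ). MSR[PR] (mask 0x4000) is set when the fault came from problem (user) state.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MSR_FP 0x00002000u   /* MSR[FP]: floating point available */
#define MSR_PR 0x00004000u   /* MSR[PR]: set in problem (user) state */

/* Assumed layout of the 260-byte float save area:
 * 32 registers x 8 bytes, followed by a 4-byte FPSCR image. */
struct float_save {
    uint32_t fpr[32][2];
    uint32_t fpscr;
};

struct pcb {
    uint32_t msr;             /* thread's saved MSR */
    struct float_save *fp;    /* NULL until the first FP instruction */
};

/* Returns 0 on success, -1 where the kernel would panic or fail. */
int float_unavailable_sketch(struct pcb *pcb, uint32_t faulting_msr)
{
    assert(pcb != NULL);

    if ((faulting_msr & MSR_PR) == 0)
        return -1;                            /* fault in supervisor mode */

    if (pcb->fp == NULL) {
        pcb->fp = calloc(1, sizeof *pcb->fp); /* zero-initialised registers */
        if (pcb->fp == NULL)
            return -1;
    }

    pcb->msr |= MSR_FP;   /* thread keeps the FP context from now on */

    /* A real handler would now load the hardware floating point
     * registers from pcb->fp before returning to the thread. */
    return 0;
}
```

After a successful call the thread's saved MSR carries the FP bit, so every later context switch for that thread includes the floating point state.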

Errors and Messages
1) Floating point unavailable in kernel mode

Since the kernel does not make use of floating point, this fault
is not expected to occur in kernel mode.

Floating Point Program Exceptions 

Introduction 

This section describes all the floating point program 
exceptions that can occur in the PowerPC architecture and 
how those exceptions are processed in the microkernel. It 
provides functional descriptions of all the routines that are 
related to the floating point enabled program exception 
handling. 

PowerPC information 

The control with regard to enabling and disabling the 
floating point program exceptions is provided in the Pow- 
erPC hardware both in the machine state register as well as 
in the Floating Point Status and Control register. Both the 
registers have floating point exception enable bits that need 
to be set to recognize and process these exceptions. FIG. 7 
illustrates the bit significance of FPSCR register. 
A floating point program exception occurs when no higher
priority exception exists and the following conditions, which
correspond to bit settings in SRR1, occur during execution
of an instruction.

System floating point enabled exception is generated 
when the following condition is met: 

(MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1

FPSCR[FEX] is set by the execution of a floating point
instruction that causes an enabled exception or by the
execution of a "move to FPSCR" type instruction that sets
an exception bit when its corresponding enable bit is set. In
the MPC601, all floating point enabled exceptions taken
clear SRR1[15] to indicate that the address in SRR0 points
to the instruction that caused the exception because all
floating point enabled exceptions are handled in a precise
manner on the MPC601.

Floating point exceptions are signalled by condition bits
set in the floating point status and control register. They can
cause the system floating point enabled exception error
handler to be invoked.
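
As a rough illustration of the condition above, the check can be written as a small predicate. The function name and the PPC_BIT helper are hypothetical; the masks assume the 32-bit PowerPC convention in which bit 0 is the most significant bit, with FE0 and FE1 at MSR bits 20 and 23 and FEX at FPSCR bit 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PowerPC numbers bits from the most significant end: bit 0 is the MSB. */
#define PPC_BIT(n)  (UINT32_C(1) << (31 - (n)))

#define MSR_FE0     PPC_BIT(20)   /* 0x00000800 */
#define MSR_FE1     PPC_BIT(23)   /* 0x00000100 */
#define FPSCR_FEX   PPC_BIT(1)    /* 0x40000000 */

/* True when (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1. */
bool fp_enabled_exception_pending(uint32_t msr, uint32_t fpscr)
{
    bool fe  = (msr & (MSR_FE0 | MSR_FE1)) != 0;
    bool fex = (fpscr & FPSCR_FEX) != 0;
    return fe && fex;
}
```

Either enable bit in the MSR is sufficient on its own, which matches the statement that the two bits are ORed.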

The following conditions that can cause program excep- 
tions are detected by the processor. These conditions may 
occur during execution of floating point arithmetic instruc- 
tions. The corresponding bits set are indicated in parenthe- 
ses. 

I) Invalid floating point operation exception (VX)

   i) sNaN (VXSNAN)
   ii) Inf - Inf (VXISI)
   iii) Inf/Inf (VXIDI)
   iv) zero/zero (VXZDZ)
   v) Inf*zero (VXIMZ)
   vi) Illegal compare (VXVC)

II) Software request condition (VXSOFT)

III) Illegal integer convert

IV) Zero divide

V) Overflow

VI) Underflow

VII) Inexact

The exception bit indicates occurrence of the corresponding
condition. If a floating point exception occurs, the
corresponding enable bit governs the results produced by the
instruction and, in conjunction with bits FE0 and FE1,
whether and how the system floating point enabled exception
handler is invoked.

When an exception occurs, the instruction execution may
be suppressed or a result may be delivered, depending on the
exception type as well as on whether the exception is enabled.
Instruction execution is suppressed for:

i) enabled illegal floating point operation

ii) enabled zero divide

A default result is generated and written to the destination
specified by the instruction causing the exception for:

i) disabled and enabled overflow

ii) disabled and enabled underflow

iii) disabled and enabled inexact

iv) disabled zero divide

v) disabled illegal floating point operation

In the PowerPC architecture, setting an enable bit causes the
generation of the result value specified in the IEEE default
behavior standard for the "trap enabled" case; if the
enable bit is 0, the default value specified for the "trap
disabled" case is generated. The "trap disabled"
case is when both FE0 and FE1 are cleared in the MSR and
all the enable bits are cleared in the FPSCR. If the program
exception handler should notify the software that a given
exception condition has occurred, the corresponding FPSCR
enable bit must be set and a mode other than Ignore
exception mode should be selected. In the MPC601,
FE0 and FE1 are ORed together. Unless both are cleared, the
MPC601 operates in precise mode.

The MSR register bits FE0 and FE1 (bit positions 20 and
23) both need to be on to enable the processor to
execute in "Synchronous precise mode". This ensures that
all the floating point program exceptions are recognized
and the floating point exception handler is invoked if they
are individually enabled through the control bits of the
FPSCR.

The standard default results may be satisfactory under 
most circumstances. This coupled with the performance 
optimization objectives, renders the Synchronous precise 
mode optional and to be used only for debugging and 
specialized applications. 

The program exceptions are vectored at 0x0700 in the
vector table.

SRR0 holds the effective address of the instruction that
caused the exception.

SRR1 is set as follows:

0-10    cleared
11      set to indicate a floating point enabled program exception
12-15   cleared
16-31   loaded from bits 16-31 of the MSR at the time the
        exception occurred
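
The SRR1 layout listed above can be decoded with two small helpers. The names are hypothetical and the sketch assumes a 32-bit SRR1 with bit 0 as the most significant bit, so bit 11 corresponds to mask 0x00100000 and bits 16-31 to the low halfword.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PPC_BIT(n)        (UINT32_C(1) << (31 - (n)))
#define SRR1_FP_ENABLED   PPC_BIT(11)        /* 0x00100000 */
#define SRR1_MSR_MASK     UINT32_C(0x0000FFFF)  /* bits 16-31 */

/* True when the program exception was a floating point enabled one. */
bool srr1_is_fp_enabled_exception(uint32_t srr1)
{
    return (srr1 & SRR1_FP_ENABLED) != 0;
}

/* Bits 16-31 of the MSR as they were when the exception occurred. */
uint32_t srr1_saved_msr_bits(uint32_t srr1)
{
    return srr1 & SRR1_MSR_MASK;
}
```

A second level handler could use the first helper to select the floating point branch and the second to recover, for example, whether the faulting code ran in problem state.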

Microkernel information 

Once a thread attains the floating point capability, it can,
while executing floating point instructions, potentially cause
synchronous floating point program exceptions if enabled
for such faults.

The system pr_slih handler is invoked by the FLIH for
many exception conditions including program exceptions.
The floating point enabled exceptions are such exceptions
and are handled by the pr_slih routine.

Since these exceptions are caused by program instructions, it
is adequate at the kernel level for the system pr_slih handler
to obtain the current floating point status of the faulting thread,
format a floating point enabled program exception message,
and report it to the exception server.

Additionally, a kernel interface is provided to the applications
in order to set and get the hardware state for a
specific thread within a task. These calls are provided so that
individual threads can control and manipulate the register
settings and fetch the status information.
These calls are machine specific since they directly
read and write into the thread's machine state save area.
Actual details of the interface are explained in the following
sections.

External Interface

pr_slih function interface

The system pr_slih handler is invoked in case of a
floating point program exception as follows:

pr_slih(struct ppc_saved_state *state,
        long srr1,
        long dsisr,
        long dar)

where

state is the machine state as saved upon kernel entry
srr1 is the machine status save/restore register SRR1
dsisr is the DSISR register settings when the exception
occurred
dar is the data address register

The pr_slih routine formats an exception message and
raises an exception to the exception server by calling the
exception routine

exception(exc, codes, code_size)

where

exc is the generic exception type
codes is an array of values including register settings and
so on
code_size is the number of elements in the codes array
Kernel thread interface

The kernel interface comprises two state related routines,
namely:

thread_set_state(thread_t thread, int flavor,
thread_state_t new_state, uint new_state_count)

where

thread - thread for which the state is to be altered
flavor - machine specific flavor
    PPC_THREAD_STATE - refers to the thread's machine
    context except FP
    PPC_FLOAT_STATE - refers to the thread's FP context
new_state - new state
count - number of natural storage units for the state set

thread_get_state(thread_t thread, int flavor,
thread_state_t new_state, uint *new_state_count)

where

thread - thread the state of which is to be obtained
flavor - machine specific flavor
    PPC_STATE_FLAVOR_LIST - list of flavors supported
    by the ppc implementation
    PPC_THREAD_STATE - refers to the thread's machine
    context except FP
    PPC_FLOAT_STATE - refers to the thread's FP context
new_state - new state
count - number of natural storage units for the state set
Data structures

Floating point program exception handler: pr_slih routine

The floating point program exception handling portion of the
pr_slih handler deals with the following data structures:

codes - the code array passed to the exception call
code_size - number of elements that are present in the code
array

The code array is filled as follows:

codes[0] = EXC_FLOAT_ARITHMETIC; /* defined in machine
                                    specific exception.h
                                    include file */
codes[1] = EA; /* effective address of the instruction that
                  caused the exception */
code_size = 2;
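
A minimal sketch of how such a code array might be filled is shown below. The structure and function names are hypothetical, and the numeric values of the two constants are placeholders, since the text only says they come from machine specific include files. The subcode name follows the one used in the pr_slih logic later in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values: the real constants live in the kernel's
 * exception headers and are not given in the text. */
#define EXC_ARITHMETIC             3
#define EXC_PPC_FLOAT_ARITHMETIC   1

struct exc_report {
    int  exc;          /* generic exception type */
    long codes[2];     /* codes[0]: subcode, codes[1]: faulting address */
    int  code_size;    /* number of valid entries in codes[] */
};

/* Fill the report the way the floating point branch is described:
 * the subcode first, then the effective address of the instruction. */
void format_fp_exception(struct exc_report *r, long faulting_ea)
{
    assert(r != NULL);
    r->exc       = EXC_ARITHMETIC;
    r->codes[0]  = EXC_PPC_FLOAT_ARITHMETIC;
    r->codes[1]  = faulting_ea;
    r->code_size = 2;
}
```

The filled report corresponds to the three arguments passed to the exception routine: the type, the code array and its element count.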

Floating point kernel interface [thread_set_state( ) and
thread_get_state( )]

1. thread_set_state with ppc_thread_state flavor

The thread's machine state in the thread's PCB
(thread->pcb->pcb_rss).

This is the ppc_saved_state structure of the thread's pcb
and it is modified with the state information that the user has
provided. The ppc_thread_state structure that is used as a
handle to pass the state information is defined in machine
specific include files.

2. thread_set_state with ppc_float_state flavor

The thread's machine float state in the thread's PCB
(thread->pcb->pcb_fp).

This is the floatsave area of the thread's pcb that is set to
the user provided state information. The ppc_float_state
structure that is used as a handle to pass the state information
is defined in machine specific include files.

3. thread_get_state with ppc_thread_state flavor

This does not alter the thread's data structure. It simply
copies the thread's machine state information from its pcb to
the structure passed by the user.
4. thread_get_state with ppc_float_state flavor

This routine in turn calls the float_get_state( ) routine,
which synchronizes the floating point state information
if the requesting thread is the floating thread. That is,
it stores the floating point hardware registers into the
thread's pcb floatsave area before it passes that information
to the user, consistent with the lazy floatsave policy. It turns the
"FP available" bit in the MSR off.
[Note: All the above routines call thread_hold( ) to suspend the
thread temporarily while modifying the thread's data structures
and call thread_release( ) after they are finished
modifying the state information.]

Functional Description

pr_slih handler

Function name: pr_slih( )

Purpose:
The pr_slih handler is invoked for multiple exception
conditions. So based on the reason passed to it by the FLIH,
its control flow is altered. This section describes the floating
point program exception specific logic of the pr_slih handler.

Prototype: void pr_slih(struct ppc_saved_state *state,
                        long srr1,
                        long dsisr,
                        long dar)

Input:

state: the machine state as saved upon kernel entry
srr1: the machine status save/restore register SRR1
dsisr: DSISR register settings for the exception
dar: the data address register

Output: none
Return values: none
Error codes:
Routines invoked: panic( ), float_read_fpscr( ), exception( )

Logic

begin
    if (problem state is supervisor mode)
    then
        panic();
    end
    else
    begin
        switch (reason)
        begin
            case FP_PROGRAM_EXCEPTION:
            begin
                set exception to EXC_ARITHMETIC;
                set codes[0] to EXC_PPC_FLOAT_ARITHMETIC;
                set codes[1] = state->iar;
                code_size = 2;
                break;
            end
            default:
        end
    end
    call exception(exception, codes, code_size);
    /* to raise an exception to the exception server in
       the exception port */
end

Kernel Interface

The kernel interface essentially comprises the following
major routines in the thread library:

1. thread_get_state: to get the current state information for
the thread for a machine specific flavor

2. thread_set_state: to set the current state information for
the thread for a machine specific flavor

These two calls provide a generic interface to the outer
world by taking specific machine flavors and the corresponding
state information as parameters. They in turn call
machine specific routines that alter the pcb structure for the
thread. They are

1. thread_setstatus( )

2. thread_getstatus( )

They are explained in the following sections.

thread_set_state

Purpose:
To provide a generic thread interface to deal with machine
dependent hardware specific flavors and set the required
state of the thread according to the flavor.

Prototype:
kern_return_t thread_set_state(thread_t thread, int flavor,
thread_state_t new_state, uint new_state_count)

Input:

thread: current thread's data structure
flavor: machine flavor
    PPC_FLOAT_STATE
    PPC_THREAD_STATE
    (These are the only two flavors
    that are currently supported)
state: the machine state corresponding to
the machine flavor
count: byte count of state information (fixed for each flavor)

Output: none
Return values: KERN_SUCCESS if successful
KERN_INVALID_VALUE if the flavor passed is not a
legal flavor value
Error codes: none
Routines invoked: thread_setstatus

Logic

begin
    if (thread eq NULL OR thread is the current
        thread executing)
        return (KERN_INVALID_ARGUMENT);
    call thread_hold;       /* the thread is suspended */
    call thread_do_wait;    /* wait until thread
                               enters 'STOPPED' state */
    call thread_setstatus;  /* call machine specific
                               setstatus routine */
    call release_thread;
end
thread_get_state

Purpose:
To provide a generic thread interface to deal with machine
dependent hardware specific flavors and get the required
state of the thread according to the flavor.

Prototype:
kern_return_t thread_get_state(thread_t thread, int flavor,
thread_state_t new_state, uint *old_state_count)

Input:

thread: current thread's data structure
flavor: machine flavor
    PPC_FLOAT_STATE
    PPC_THREAD_STATE
    (These are the only two flavors
    that are currently supported)
state: the machine state corresponding to the machine
flavor
count: byte count of state information (fixed for each
flavor)

Output: none
Return values: KERN_SUCCESS if successful
KERN_INVALID_VALUE if the flavor passed is not a
legal flavor value
Error codes: none
Routines invoked: thread_getstatus

Logic

begin
    if (thread eq NULL OR thread is the current
        thread executing)
        return (KERN_INVALID_ARGUMENT);
    call thread_hold;       /* the thread is suspended */
    call thread_do_wait;    /* wait until thread
                               enters 'STOPPED' state */
    call thread_getstatus;  /* call machine specific
                               getstatus routine */
    call release_thread;
end

thread_getstatus

Purpose:
The thread_getstatus routine, based on the flavor
requested, would appropriately get the registers in the
machine state associated with the thread. Since this section
particularly dwells on the floating point state, it provides
only the floating point pertinent information.

Prototype:
kern_return_t thread_getstatus(thread_t thread, int flavor,
thread_state_t tstate, uint *count)

Input:

thread: current thread's data structure
flavor: machine flavor
    PPC_STATE_FLAVOR_LIST
    PPC_FLOAT_STATE
    PPC_THREAD_STATE
    (These are the only flavors that are currently
    supported)
state: the machine state corresponding to the machine
flavor
count: byte count of state information (fixed for each flavor)

Output: the state information requested;
the byte count of the state information
Return values: KERN_SUCCESS if successful
KERN_INVALID_VALUE if the flavor passed is not a
legal flavor value
Error codes: none
Routines invoked: float_get_state

Logic

begin
    switch (flavor)
    begin
        case THREAD_STATE_FLAVOR_LIST:
            if (count < 1)
                return (KERN_INVALID_ARGUMENT);
            tstate[0] = PPC_THREAD_STATE;
            tstate[1] = PPC_FLOAT_STATE;
            *count = 2;
            break;
        case PPC_THREAD_STATE:
        case PPC_FLOAT_STATE:
        begin
            if (count is < PPC_FLOAT_STATE_COUNT)
                return (KERN_INVALID_VALUE);
            *count = PPC_FLOAT_STATE_COUNT;
            return (float_get_state(thread,
                    (struct PPC_float_state *)tstate));
        end
        default:
    end
end

thread_setstatus

Purpose:
The thread_setstatus routine, based on the flavor
requested, would appropriately set the registers in the
machine state associated with the thread. Since this section
particularly dwells on the floating point state, it provides
only the floating point pertinent information.

Prototype:
kern_return_t thread_setstatus(thread_t thread, int flavor,
thread_state_t tstate, uint count)

Input:

thread: current thread's data structure
flavor: machine flavor
    PPC_FLOAT_STATE
    PPC_THREAD_STATE
    (These are the only two flavors
    that are currently supported)
state: the machine state corresponding
to the machine flavor
count: byte count of state information (fixed for each flavor)

Output: none
Return values: KERN_SUCCESS if successful
KERN_INVALID_VALUE if the flavor passed is not a
legal flavor value
Error codes: none
Routines invoked: float_set_state

Logic

begin
    switch (flavor)
    begin
        case PPC_THREAD_STATE:
        case PPC_FLOAT_STATE:
        begin
            if (count is not equal to
                PPC_FLOAT_STATE_COUNT)
                return (KERN_INVALID_VALUE);
            return (float_set_state(thread,
                    (struct PPC_float_state *)tstate));
        end
        default:
    end
end
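
The flavor dispatch of the get-status routine can be sketched in C as follows. This is an illustration only: getstatus_sketch and struct float_area are hypothetical, the numeric values of the constants are placeholders, the count is treated as a number of 32-bit units (65 units for an assumed 260-byte area), and the flavor list case checks for room for two entries rather than one so that the sketch cannot overrun the caller's buffer. The integer-state flavor is omitted, as in the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder numeric values; only the names come from the text. */
enum { KERN_SUCCESS = 0, KERN_INVALID_ARGUMENT = 4, KERN_INVALID_VALUE = 18 };
enum { PPC_STATE_FLAVOR_LIST = 0, PPC_THREAD_STATE = 1, PPC_FLOAT_STATE = 2 };

#define PPC_FLOAT_STATE_COUNT 65u   /* assumed: 260 bytes / 4 */

/* Hypothetical stand-in for the floatsave area in a thread's pcb. */
struct float_area { uint32_t word[PPC_FLOAT_STATE_COUNT]; };

int getstatus_sketch(const struct float_area *fp, int flavor,
                     uint32_t *tstate, unsigned *count)
{
    assert(fp != NULL && tstate != NULL && count != NULL);

    switch (flavor) {
    case PPC_STATE_FLAVOR_LIST:
        if (*count < 2)                  /* need room for both entries */
            return KERN_INVALID_ARGUMENT;
        tstate[0] = PPC_THREAD_STATE;
        tstate[1] = PPC_FLOAT_STATE;
        *count = 2;
        return KERN_SUCCESS;

    case PPC_FLOAT_STATE:
        if (*count < PPC_FLOAT_STATE_COUNT)
            return KERN_INVALID_VALUE;   /* caller's buffer too small */
        memcpy(tstate, fp->word, sizeof fp->word);
        *count = PPC_FLOAT_STATE_COUNT;
        return KERN_SUCCESS;

    default:                             /* unsupported flavor */
        return KERN_INVALID_VALUE;
    }
}
```

A caller would first ask for the flavor list, then size a buffer for the flavor it wants and call again with that flavor.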
float_set_state

Purpose:
The float_set_state routine would appropriately set the
floating point registers in the machine state associated with
the thread.

Prototype:
kern_return_t float_set_state(thread_t thread,
thread_state_t tstate)

Input:

thread: current thread's data structure
state: the machine state corresponding to the machine
flavor

Output: modified thread structure
Return values: KERN_SUCCESS if successful
KERN_FAILURE otherwise
Error codes: none
Routines invoked: none

Logic

begin
    copy new floating point state information tstate
    to the floatsave area of the thread's pcb;
    return (SUCCESS);
end

float_get_state

Purpose:
The float_get_state routine would get the floating point
machine state associated with the thread. This routine calls
the float_sync_thread( ) routine to force a lazy save of the
floating point state if the thread is the float thread.

Prototype:
kern_return_t float_get_state(thread_t thread,
thread_state_t tstate)

Input:

thread: current thread's data structure
state: the machine state corresponding to the machine
flavor

Output: requested tstate
Return values: KERN_SUCCESS if successful
KERN_FAILURE otherwise
Error codes: none
Routines invoked: float_sync_thread( )

Logic

begin
    if the thread is the floating thread
    begin
        call float_sync_thread();
    end
    copy floating point state information from the floatsave
    area to tstate;
    return (SUCCESS);
end
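
The lazy synchronisation step in float_get_state can be sketched as follows. All names here are hypothetical: struct cpu models one processor with its live floating point registers and a pointer to the thread that currently owns them, in the spirit of the per-processor data structure described for the alternate embodiment, and the 65-word image is an assumed layout of the 260-byte save area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fp_regs { uint32_t word[65]; };   /* assumed 260-byte image */

struct thread {
    struct fp_regs floatsave;            /* floatsave area in the pcb */
};

struct cpu {
    struct fp_regs hw;                   /* live FP registers */
    struct thread *float_thread;         /* owner of hw, or NULL */
};

/* Force the lazy save: write the live registers back to their owner. */
void float_sync_thread_sketch(struct cpu *cpu)
{
    assert(cpu != NULL);
    if (cpu->float_thread != NULL)
        cpu->float_thread->floatsave = cpu->hw;
}

/* Copy a thread's FP state out, syncing first if it owns the hardware. */
int float_get_state_sketch(struct cpu *cpu, struct thread *t,
                           struct fp_regs *out)
{
    assert(cpu != NULL && t != NULL && out != NULL);
    if (cpu->float_thread == t)
        float_sync_thread_sketch(cpu);   /* pcb copy may be stale */
    *out = t->floatsave;
    return 0;
}
```

For a thread that does not own the hardware, the floatsave area is already current and is returned unchanged.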
Errors and Messages
1) Program floating point enabled fault in kernel mode

Since the kernel does not make use of floating point, this fault
is not expected to occur in kernel mode.

Alignment Exceptions 

Overview 

This section illustrates various scenarios associated with
an alignment exception in the PowerPC architecture. It deals
with the alignment exception situations occurring in both
little and big endian modes. It also attempts to highlight the
differences between the MPC601 processor implementation and
the PowerPC architecture, and the instructions of the PowerPC
architecture that are not supported by the 601 processor. It
provides functional descriptions of the alignment exception
handler.

MPC-601 Information 

On the 601 processor, alignment exceptions occur under
the following conditions:

i) Any floating-point transfer with a non-memory forced I/O
segment

ii) Any transfer that crosses a segment or BAT boundary

iii) A dcbz to a write-through or cache-inhibited area

iv) A lscbx transfer that crosses a page boundary

v) Any misaligned transfer that crosses a page boundary



A misaligned transfer is one in which the data is transferred
to an address that is not an integer multiple of the size
of the data. A string or multiple transfer is considered
aligned if the transfer starts on a word boundary. When
operating in big-endian mode, the 601 processor handles all
misaligned transfers transparently, except as listed above, by
internally breaking the transfer up into several smaller sized
transfers. Note that single byte transfers never cause an
alignment exception.
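
The definition of a misaligned transfer, and the page-crossing case listed among the 601 conditions, can be expressed as two small predicates. The function names are hypothetical, and the 4 KB page size is an assumption of this sketch rather than something stated in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed 4 KB pages */

/* A transfer is misaligned when its address is not an integer
 * multiple of its size; single byte transfers are always aligned. */
bool is_misaligned(uint32_t ea, uint32_t size)
{
    return size > 1 && (ea % size) != 0;
}

/* True when the first and last bytes of the transfer lie in
 * different pages. 64-bit arithmetic avoids wrap-around at 4 GB. */
bool crosses_page(uint32_t ea, uint32_t size)
{
    uint64_t last;

    if (size == 0)
        return false;
    last = (uint64_t)ea + size - 1;
    return (ea / PAGE_SIZE) != (last / PAGE_SIZE);
}
```

Under these definitions a word access at 0x1FFE is both misaligned and page-crossing, which is the combination the 601 reports as an alignment exception in big-endian mode.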

Additionally, when the 601 processor is operating in
little-endian mode the following conditions will cause an
alignment exception to occur:

i) Any misaligned transfer

ii) Any load or store multiple or string operation
PowerPC Information 

In addition to the conditions that may cause an alignment
exception on the 601 processor, the PowerPC architecture
specifies that the following conditions may cause an alignment
exception to occur:

i) Any floating-point transfer that's not word-aligned

ii) Any fixed-point doubleword transfer that's not word-
aligned

iii) Any lmw, stmw, lwarx, or stwcx. transfer that's not
word-aligned

iv) Any ldarx, or stdcx. transfer that's not doubleword-
aligned

v) Any string transfer that crosses a page boundary

Support for operations not supported by the 601 processor
is provided by the exception handler to provide full PowerPC
compatibility. This involves adding branch out routines
into the dsisr jump table for the new instructions. See
Appendix B for a list of PowerPC instructions that may
cause alignment exceptions that are not supported by the 601
microprocessor. Code in support of quadword floating-point
loads and stores exists but will be conditionally compiled
out in the 601 processor implementation. In addition to
inserting the appropriate branch out routines into the dsisr
jump table, new modules will have to be written to deal with
fixed-point doubleword operands and for handling the
stfiwx, lwa, lwaux, and lwax instructions.

Some instructions are also interpreted differently by the
601 implementation than by a strict
PowerPC processor. These differences will have to be determined
and analyzed in full detail when moving to a strict
PowerPC architecture. As an example, load multiple and
load string operations when the source register is within the
range of the destination are permitted on the 601 processor
but are considered invalid operations under a strict PowerPC
implementation. Also, non-word-aligned load or store
multiples are invalid under the PowerPC architecture but are
permitted by the 601 processor.

Finally, the lscbx instructions implemented by the 601
processor are not part of the PowerPC architecture and
future implementations will have to decide whether to treat
these instructions as illegal instructions or to emulate them
to remain backwards compatible. If it is decided that the
lscbx instructions will be emulated then the alignment
exception handler code may be used for this purpose.

Microkernel info 

The goal of the alignment exception handler is to emulate
the transfer for the user in a completely transparent and as
expedient a manner as possible. The alignment exception
handler will break up the transfer into smaller sized transfers
that will not cause alignment exceptions.
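
One way to picture "smaller sized transfers that will not cause alignment exceptions" is to perform a word access as four single byte accesses, since byte transfers are never misaligned. The sketch below assumes big-endian byte order and uses hypothetical function names; it omits the memory protection checks that a real handler would perform on the user's behalf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 32-bit big-endian word from any address, one byte at a time. */
uint32_t emulate_load_word_be(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Store a 32-bit big-endian word to any address, one byte at a time. */
void emulate_store_word_be(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Byte-wise emulation is the simplest choice; a handler could instead use the largest aligned pieces available (for example a halfword between two bytes) to reduce the number of accesses.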

In the process of emulation, memory protection mechanisms
will be enforced as if the user-level program was
performing the transfer rather than the supervisor-level
exception handler. To enforce this restriction, the exception
handler will check for and prevent access to the kernel
segments. The exception handler will raise a data access
exception for any such potential access.
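
The kernel-segment check described above can be sketched as follows. This is only an illustration under stated assumptions, not the handler's actual code: the constant KERNEL_VA_BASE and the function name transfer_touches_kernel are hypothetical, and the sketch assumes a 32-bit effective address space in which the kernel occupies the top of the address range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: kernel segments start here and run to 2^32 - 1. */
#define KERNEL_VA_BASE 0xC0000000u

/*
 * Return true if any byte of the transfer [ea, ea + len) would fall in
 * the assumed kernel range, in which case the handler would raise a
 * data access exception instead of emulating the transfer.
 */
static bool transfer_touches_kernel(uint32_t ea, uint32_t len)
{
    assert(len > 0);                 /* every emulated transfer moves data */
    uint32_t last = ea + (len - 1);  /* address of the final byte */

    if (last < ea)                   /* wrapped past the top of memory */
        return true;
    return last >= KERNEL_VA_BASE;
}
```

Checking the last byte rather than only the starting address matters because a multiple or string transfer that starts in user space could otherwise run into the kernel range.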

Also, it will be assumed, and verified through a code 
review of the virtual memory support code, that the Kp and 
Ks bits for the user segments will always be set to the same 
value. 

Note that any and all multiple and string operations will
invoke an alignment exception when operating in little-endian
mode. As such, these instructions should never be
produced by any little-endian PowerPC compiler. These
instructions will not be emulated in little-endian mode and
will raise an illegal instruction exception instead.

Areas of code that are big-endian specific will be enclosed
in the following conditional inclusion preprocessor statements:

#if (BYTE_ORDER == BIG_ENDIAN)

#endif /* (BYTE_ORDER == BIG_ENDIAN) */



Areas of code that are little-endian specific will be 
enclosed in the following conditional inclusion preprocessor 
statements: 



#if (BYTE_ORDER == LITTLE_ENDIAN)

#endif /* (BYTE_ORDER == LITTLE_ENDIAN) */



The BYTE_ORDER token will be defined as a compiler/
preprocessor command line argument. The value used for
BYTE_ORDER will be determined through Makefile target
selection. The tokens BIG_ENDIAN and LITTLE_ENDIAN
are defined in the header file mach/endian.h.

[As indicated in "PowerPC Operating Environment
Architecture, Book III", software should not attempt to
obtain a reservation for unaligned lwarx (or ldarx) operands,
nor to simulate an unaligned stwcx. (or stdcx.). For this
reason these events will not be emulated and will raise an
alignment exception instead.]

Alignment exception handling - user choice

Sometimes specific application and system scenarios
require that the system not handle alignment exceptions
every time they occur but simply notify the application of
them. This is done primarily for performance reasons.
The application can then choose the best
way to handle the alignment problems, as opposed to trapping
into the kernel. To facilitate this, functionality is
provided such that a thread can register itself to be notified
by the system in the event of an alignment
exception. From then on, the application may choose to switch
to byte memory access, which will not cause alignment
exceptions.

External interface 

Since the goal of the alignment exception handler is to
provide transparent resolution of the exception, there is no
external interface required. Putting this aside, it may be
desirable to provide a mechanism for informing the developer
of code that produces misaligned transfers. There are
two mechanisms which would be useful for relaying this
information to the developer. The first is to insert trace hooks
into the exception handler when PowerPC assembly language
trace hook macros become available. The second
method is to implement a special flavor of thread_state that
indicates that misaligned transfers are to raise an exception.
Only misaligned transfers, not boundary crossings, would
cause an exception to be raised. This mechanism will not be
implemented as part of this design, and is only mentioned here
as a possible future enhancement.

Functional Description 

specifications 

The low memory vector address for the alignment handler
is at offset 0x600 from the base address indicated by the
setting of the MSR[IP] bit. Upon entry to the alignment
handler, the machine is in the following state:

i) External Interrupts are disabled. 

ii) Processor is privileged to execute any instruction. 

iii) Processor can not execute any floating point instructions, 
including floating-point loads, stores, and moves. 

iv) Floating point exceptions are disabled. 

v) Instruction address translation is off.

vi) Data address translation is off. 

vii) SRR0 contains the address of the instruction causing the
exception.

viii) SRR1 contains bits 16-31 of the MSR.

ix) DAR contains the starting transfer address for the 
operation that caused the exception. 

x) DSISR contains selected bits of the instruction for decoding
the type of instruction that caused the exception.

Alignment exceptions will be treated as non-context
switching events which are only invoked from user-level
(problem mode) programs. To expedite processing and to
prevent nesting, the following policies will be implemented:

i) the alignment exception handler will avoid a full state save 
and will only save those registers used or affected by the 
exception handler code. 

ii) external interrupts will remain disabled. 

iii) instruction translations will remain disabled. 

iv) data translations will remain disabled except as necessary 
to perform the unaligned load or store. 

v) AST checks will not be performed on return from the 
exception handler. 

vi) The only exception that should occur during alignment 
handler execution is a data access exception while per- 
forming the unaligned load or store. 

vii) Handler code segment and private cpu save area must be
accessed in real mode (translations off).

viii) An exception will be raised immediately for the following
cases: effective address within kernel segment
(EXC_BAD_ACCESS/KERN_INVALID_ADDRESS); unaligned
lwarx, ldarx, stwcx., stdcx. operands
(EXC_HW_EMULATION/EXC_PPC_ALIGNMENT); attempted
execution of lswi, lswx, stswi, stswx,
lscbx, lscbx., lmw, or stmw while in little-endian mode
(EXC_BAD_INSTRUCTION/EXC_PPC_BEOPONLY).

Handler Design 

It is possible for the alignment handler to cause a data
access exception due to a page fault or protection violation.
This is handled with a special dependence on the data access
exception handler. The data access exception handler must
determine if the exception was caused by the alignment
exception handler by checking the MSR[IT] bit in the SRR1
register. If this bit is clear, then the data access exception
handler resolves the fault condition, backtracks to the original
machine state prior to the alignment exception by
restoring state saved by the alignment exception handler,
and restarts the original instruction. This will result in
another alignment exception, but this time no data access
exception should be generated since the page fault condition has been
resolved.
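
The test performed by the data access exception handler can be sketched as below. This is a minimal illustration, not the patent's code: the function name is hypothetical, and the mask value assumes that MSR[IT] is MSR bit 26 with bit 0 as the most significant bit of a 32-bit register. The reasoning is that user code runs with instruction translation on, while the alignment handler keeps instruction translation disabled, so a clear IT bit in the saved MSR image identifies the alignment handler as the faulting code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask for MSR[IT] (instruction translation enable). */
#define MSR_IT 0x00000020u

/* Documents the bit numbering assumption: bit 0 is the MSB. */
static_assert(MSR_IT == (1u << (31 - 26)), "MSR[IT] assumed to be bit 26");

/*
 * srr1 holds the MSR bits saved when the data access exception was
 * taken. Returns true when the fault was raised while instruction
 * translation was off, i.e. from inside the alignment handler.
 */
static bool fault_raised_by_alignment_handler(uint32_t srr1)
{
    return (srr1 & MSR_IT) == 0;
}
```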

FIG. 8 is a flow diagram of the alignment exception
handler 194, which is part of the PowerPC exception handler
190. The steps are as follows:

1) Entry at physical address 0x600. 

2) Temporarily save a work register into SPR_G0.

3) Get address of cpu_vars.fh_save_area from the SPR_CPU
register.

4) Convert virtual address of fh_save_area into a physical
address.

5) Save registers used or affected by exception handler
(GPR25 through GPR31, LR, CR, XER, SRR0, and
SRR1).

6) Move copies of DSISR, DAR, and MSR into work 
registers. 

7) Assert that processor was in problem mode at time of 
exception. 

8) Check address bounds of operation against kernel virtual 
address space. 

9) Move DSISR into CR for bit tests. 

10) Branch into instruction decode (dsisr) table based on
DSISR[15-21].

11) Execute appropriate submodule (submodule descrip- 
tions are given in the following submodules section) 

12) Restore saved state and return to user mode. 
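
Steps 9 through 11 amount to a table-driven dispatch, which can be sketched as follows. This is an illustrative sketch only: the function names, return codes, and the two populated table slots are hypothetical and do not reproduce the architected DSISR encodings. The sketch assumes bit 0 is the most significant bit of the 32-bit DSISR, so DSISR[15-21] is a 7-bit field selecting one of 128 entries.

```c
#include <assert.h>
#include <stdint.h>

#define DSISR_TABLE_SIZE 128   /* 7 index bits: DSISR[15-21] */

typedef int (*submodule_fn)(uint32_t dsisr, uint32_t dar);

/* Hypothetical submodules; real ones would emulate the transfer. */
static int fixed_point_load(uint32_t dsisr, uint32_t dar)
{ (void)dsisr; (void)dar; return 1; }
static int fixed_point_store(uint32_t dsisr, uint32_t dar)
{ (void)dsisr; (void)dar; return 2; }
static int unused_entry_panic(uint32_t dsisr, uint32_t dar)
{ (void)dsisr; (void)dar; return -1; }

static submodule_fn dsisr_table[DSISR_TABLE_SIZE];

/* Extract DSISR[15-21], counting bit 0 as the most significant bit. */
static unsigned dsisr_index(uint32_t dsisr)
{
    return (dsisr >> (31 - 21)) & 0x7Fu;
}

/* Point every entry at the panic routine, then fill in known slots. */
static void dsisr_table_init(void)
{
    for (unsigned i = 0; i < DSISR_TABLE_SIZE; i++)
        dsisr_table[i] = unused_entry_panic;
    /* Illustrative slot assignments only, not the real encodings. */
    dsisr_table[0x00] = fixed_point_load;
    dsisr_table[0x02] = fixed_point_store;
}

/* Branch to the submodule selected by the DSISR contents. */
static int dispatch(uint32_t dsisr, uint32_t dar)
{
    unsigned idx = dsisr_index(dsisr);
    assert(idx < DSISR_TABLE_SIZE);
    return dsisr_table[idx](dsisr, dar);
}
```

Defaulting every slot to the panic routine means an unexpected index is caught rather than branching through an uninitialized pointer.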

Alignment Handler - Submodules

Fixed Point Load Module:

This module handles all of the fixed point load instructions.
The appropriate number of bytes (2 or 4) are loaded
individually and reassembled into a scratch register, then
manipulated as necessary for a byte-reverse or algebraic operation.
Then, the load table is used to move the data to the
appropriate target register. Finally, a check for update form
is performed and the address register is updated with the
effective address of the instruction as appropriate.
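
The byte-wise reassembly performed by this module can be sketched as below. This is a sketch under stated assumptions rather than the module itself: the function name and its flag parameters are hypothetical, big-endian operand order is assumed, and the move through the load table and the update-form handling are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Load `size` bytes (2 or 4) one at a time from `src` and reassemble
 * them into a 32-bit scratch value in big-endian order. `byte_reverse`
 * models the byte-reversed load forms; `algebraic` sign-extends a
 * halfword result to 32 bits.
 */
static uint32_t emulate_fixed_load(const uint8_t *src, size_t size,
                                   bool byte_reverse, bool algebraic)
{
    assert(size == 2 || size == 4);
    uint32_t value = 0;

    for (size_t i = 0; i < size; i++) {
        size_t idx = byte_reverse ? (size - 1 - i) : i;
        value = (value << 8) | src[idx];   /* single-byte access only */
    }
    if (algebraic && size == 2 && (value & 0x8000u))
        value |= 0xFFFF0000u;              /* sign-extend the halfword */
    return value;
}
```

Because every memory access is a single byte, the emulation itself can never raise a further alignment exception.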

Fixed Point Store Module:

This module handles all of the fixed point store instructions.
The store table is used to move the data from the
source register to a scratch register. Then, the data (2 or 4
bytes) is stored to the target address one byte at a time,
manipulating the data as necessary for byte-reversed operations.
Finally, a check for update form is performed and the
address register is updated with the effective address of the
instruction as appropriate.

Floating Point Load Module:

This module handles all of the floating point load instructions.
The appropriate number of bytes (4, 8, or 16) are
loaded from the source address individually and reassembled
into scratch register(s) and written to the local save
area. The floating point table is then used to move the data
from the save area to the appropriate target floating point
register(s). Finally, a check for update form is performed and
the address register is updated with the effective address of the
instruction as appropriate.

Floating Point Store Module:

This module handles all of the floating point store instructions.
The floating point table is used to move the appropriate
number of bytes (4, 8, or 16) from the floating point
source register to the local save area. Then, the data is
written to the target address 1 byte at a time. Finally, a check
for update form is performed and the address register is
updated with the effective address of the instruction as
appropriate.

Load Multiple and Load String Module:

This module handles the move assist load string instructions
as well as the load multiple instruction. The length of
data to be transferred is acquired, and then the data is loaded
a byte at a time and reassembled into a scratch register.
When the scratch register is full, the load table is used to
move the data to the appropriate target register. If the target
register ever overlaps the address register, the 4 bytes
targeted for that register are ignored.

NOTE: In the case of the load string immediate, the actual
instruction will have to be fetched in order to determine the
length of the operation.
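
The overlap rule for the load multiple case can be sketched as follows. This is a simplified illustration, not the module's code: the function name and the register-array model are hypothetical, big-endian word order is assumed, and the load string variants with their byte-count handling are not covered.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_GPRS 32

/*
 * Emulate a load multiple: fill regs[rt..31] with consecutive 4-byte
 * words read a byte at a time from `src`. When the target register is
 * the address register `ra`, the 4 bytes meant for it are skipped so
 * the base address is preserved.
 */
static void emulate_load_multiple(uint32_t regs[NUM_GPRS], unsigned rt,
                                  unsigned ra, const uint8_t *src)
{
    assert(rt < NUM_GPRS && ra < NUM_GPRS);

    for (unsigned r = rt; r < NUM_GPRS; r++, src += 4) {
        uint32_t scratch = 0;
        for (unsigned i = 0; i < 4; i++)      /* byte-wise reassembly */
            scratch = (scratch << 8) | src[i];
        if (r != ra)                          /* ignore the overlapping word */
            regs[r] = scratch;
    }
}
```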

Store Multiple and Store String Module:

This module handles the move assist store string instructions
as well as the store multiple instruction. The length of
data to be transferred is acquired, and then the data is moved
4 bytes at a time via the store table to a scratch register,
which is then written 1 byte at a time to the target address.

NOTE: In the case of the store string immediate, the actual
instruction will have to be fetched in order to determine the
length of the operation.

Load String and Compare Module: 

This module handles only the load string and compare
byte instruction. Bytes are loaded 1 at a time and compared
against the match byte of the XER. When a match is found,
or the maximum length as specified in the XER is reached,
the resulting length field of the XER is updated and, if this
instruction was a record form, the appropriate Condition
Register field is updated.

NOTE: The actual instruction will have to be fetched in 
order to determine the setting of the record mode bit. 
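
The compare loop of this module can be sketched as below. This is a hedged sketch, not the actual emulation: the names are hypothetical, the destination is modeled as a byte buffer rather than a run of registers, the 7-bit length limit is an assumption, and the sketch assumes the matching byte is itself counted in the resulting length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct lscbx_result {
    size_t count;    /* bytes loaded; would be written to the XER length field */
    bool   matched;  /* true if the match byte was found; drives the CR update */
};

/*
 * Load bytes one at a time from `src` into `dst`, stopping after the
 * byte equal to `match` or after `max_len` bytes, whichever is first.
 */
static struct lscbx_result emulate_lscbx(const uint8_t *src, size_t max_len,
                                         uint8_t match, uint8_t *dst)
{
    assert(max_len <= 127);  /* assumed 7-bit length field */
    struct lscbx_result res = { 0, false };

    while (res.count < max_len) {
        uint8_t b = src[res.count];
        dst[res.count] = b;
        res.count++;
        if (b == match) {
            res.matched = true;
            break;
        }
    }
    return res;
}
```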

Data Cache Block Zero Module: 

This module handles only the data cache block zero 
instruction. The cache block boundaries are determined 
from the target address, and the resulting block of memory 
is cleared. 
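
Determining the block boundaries and clearing the block can be sketched as follows. This is an illustration only: the function name is hypothetical, memory is modeled as a byte array indexed by offset, and the cache block size is passed in as a parameter because it is implementation specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Clear the cache block containing offset `ea` within `mem`.
 * `block_size` must be a power of two. Returns the offset of the
 * start of the block that was cleared.
 */
static size_t emulate_dcbz(uint8_t *mem, size_t mem_size, size_t ea,
                           size_t block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    size_t start = ea & ~(block_size - 1);  /* round down to block boundary */
    assert(start + block_size <= mem_size);

    memset(mem + start, 0, block_size);
    return start;
}
```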

Data Structures 

Data structures required to support the alignment handler
are all accessed through the system special purpose register
cpu data pointer. The design requires modification of the
cpu_vars structure to include the fast exception save area
and the physical addresses of the various alignment handler
jump tables (dsisr, update, load, store, floating-point ops).

Each CPU must have its own private fast handler save 
area. The size of the fast handler save area is 64 bytes and 
must be quadword aligned. The fast handler save area will 
be at the beginning of the private cpu data structure cpu_ 
vars referenced as element fh_save_area. The layout of the 
fast handler save area is as follows: 



struct fh_save_area {
    unsigned long fh_scratch1;
    unsigned long fh_scratch2;
    unsigned long fh_scratch3;
    unsigned long fh_scratch4;
    unsigned long fh_gpr25;
    unsigned long fh_gpr26;
    unsigned long fh_gpr27;
    unsigned long fh_gpr28;
    unsigned long fh_gpr29;
    unsigned long fh_gpr30;
    unsigned long fh_gpr31;
    unsigned long fh_srr0;
    unsigned long fh_srr1;
    unsigned long fh_lr;
    unsigned long fh_cr;
    unsigned long fh_xer;
};
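
The stated size and alignment requirements can be checked at compile time, as in the sketch below. It is not part of the described design: the type and function names are hypothetical, and uint32_t stands in for unsigned long on the assumption that unsigned long is 32 bits on the target, so that sixteen words occupy exactly 64 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirror of the fast handler save area. */
struct fh_save_area_sketch {
    uint32_t fh_scratch[4];  /* scratch words 1..4 */
    uint32_t fh_gpr[7];      /* GPR25..GPR31 */
    uint32_t fh_srr0;
    uint32_t fh_srr1;
    uint32_t fh_lr;
    uint32_t fh_cr;
    uint32_t fh_xer;
};

/* Sixteen 4-byte words must fill the 64-byte save area exactly. */
static_assert(sizeof(struct fh_save_area_sketch) == 64,
              "fast handler save area must be 64 bytes");

/* A quadword is 16 bytes, so the low four address bits must be zero. */
static bool is_quadword_aligned(const void *p)
{
    return ((uintptr_t)p & 0xFu) == 0;
}
```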



The cpu_vars private cpu data structure will also be
modified to contain the physical addresses of the five
alignment handler jump tables. The five alignment handler
jump tables are comprised of: the initial dsisr jump table
which determines the instruction to be emulated; the fixed-point
load table indexed by target register; the fixed-point
store table indexed by source register; the update table used
to update the rA register of the instruction; and the floating-point
operation table which is indexed by instruction and the
target or source floating-point register.
Errors/Messages 

Any error condition encountered during processing of the
alignment exception will be considered a catastrophic system
failure which will result in a panic. The only anticipated
source of error is kernel code making unaligned
accesses, which is to be considered a bug. An assert check for
kernel-level invocation will be used to identify this condition.
Unused jump entries in the dsisr table will point to
panic code, but these entries will only be accessed in the
event of a processor micro-code failure.

The resulting exception handling method and apparatus
invention provides improved efficiency in the operation of a
PowerPC processor running a microkernel operating system.

Although a specific embodiment of the invention has been 
disclosed, it will be understood by those having skill in the 
art that changes can be made to that specific embodiment 
without departing from the spirit and scope of the invention. 

What is claimed is: 

1. An article of manufacture for use in a data processing
system including a memory and a processor that has a
plurality of fixed point registers and a plurality of floating
point registers, comprising:

a computer useable medium having computer readable
program code means embodied therein for providing a
method for managing process threads that are to be
executed by the processor, the computer readable program
code means in said article of manufacture comprising:

computer readable program code means for causing a
computer to create a process thread in the memory to be
executed by the processor, and a process control block
in the memory to store thread information;

computer readable program code means for causing a
computer to store in the process control block a non-floating
point indication that the process thread is not
enabled to perform floating point operations;

computer readable program code means for causing a
computer to execute during a first occurring session,
only fixed point operations with the process thread in
the processor using the plurality of fixed point registers;

computer readable program code means for causing a
computer to remove the process thread from the processor
at a termination of the first session and storing
first values of the fixed point registers in the process
control block and, in response to said non-floating point
indication, not storing the contents of the plurality of
floating point registers in the process control block;

computer readable program code means for causing a
computer to restore the execution of the thread in the
processor in a second occurring session by detecting
said non-floating point indication in the process control
block, and in response thereto, performing a lazy
context restore operation by loading said first values
from the process control block into the plurality of
fixed point registers and not loading the plurality of
floating point registers of the processor;

computer readable program code means for causing a
computer to execute during said second occurring
session, fixed point operations with the process thread
in the processor using the plurality of fixed point
registers;

computer readable program code means for causing a
computer to attempt to execute a floating point instruction
in the process thread during said second session,
and in response thereto, calling an exception handler;

computer readable program code means for causing a
computer to use said exception handler to store an
alternate floating point indication in the process control
block, to indicate that the process thread is enabled to
perform floating point operations;
computer readable program code means for causing a 
computer to resume execution of said floating point 
instruction in the process thread; 
computer readable program code means for causing a 
computer to remove the process thread from the pro- 
cessor at a termination of said second session and 
storing second values of the plurality of floating point 
registers in the process control block in response to said 
alternate floating point indication; and 
computer readable program code means for causing a 
computer to restore the execution of the process thread 
in the processor in a third occurring session by detect- 
ing said alternate floating point indication, and in 
response thereto, performing a lazy context restore 
operation by loading said second values from the 
process control block into the plurality of floating point 
registers of the processor. 
2. An article of manufacture for use in a data processing 
system including a memory, a first processor that has a first 
plurality of fixed point registers and a first plurality of 
floating point registers, a second processor that has a second 
plurality of fixed point registers and a second plurality of 
floating point registers, comprising: 

a computer useable medium having computer readable 
program code means embodied therein for providing a 
method for managing a process thread that is to be 
executed by the processors, the computer readable 
program code means in said article of manufacture 
comprising: 

computer readable program code means for causing a 
computer to create the process thread in the memory to 
be executed by the first processor, and a process control 
block in the memory to store thread information; 

computer readable program code means for causing a
computer to store in the process control block a non-floating
point indication that the process thread is not
enabled to perform floating point operations;

computer readable program code means for causing a 
computer to execute during a first occurring session, 
only fixed point operations with the process thread in 
the first processor using the first plurality of fixed point 
registers; 

computer readable program code means for causing a
computer to remove the thread from the first processor
at a termination of the first session and storing first
values of the first plurality of fixed point registers in the
process control block and, in response to said non-floating
point indication, not storing the contents of the
first plurality of floating point registers in the process
control block;

computer readable program code means for causing a
computer to restore the execution of the process thread
in the second processor in a second occurring session
by detecting said non-floating point indication in the
process control block, and in response thereto, performing
a lazy context restore operation by loading said first
values from the process control block into the second
plurality of fixed point registers and not loading the
second plurality of floating point registers of the second
processor;

computer readable program code means for causing a
computer to execute during said second occurring
session, fixed point operations with the process thread
in the second processor using the second plurality of
fixed point registers;

computer readable program code means for causing a
computer to attempt to execute a floating point instruction
in the process thread during said second session,
and in response thereto, calling an exception handler;

computer readable program code means for causing a 
computer to use said exception handler to store an 
alternate floating point indication in the process control 
block, to indicate that the process thread is enabled to 
perform floating point operations; 

computer readable program code means for causing a 
computer to resume execution of said floating point 
instruction in the process thread in the second proces- 
sor; 

computer readable program code means for causing a
computer to remove the process thread from the second
processor at a termination of said second session and
storing second values of the second plurality of floating
point registers in the process control block in response
to said alternate floating point indication; and

computer readable program code means for causing a
computer to restore the execution of the process thread
in the second processor in a third occurring session by
detecting said alternate floating point indication in the
process control block, and in response thereto, performing
a lazy context restore operation by loading said
second values from the process control block into the
second plurality of floating point registers of the second
processor.

* * * * *